Satellites play a critical role in global commerce and security, from monitoring supply chains and tracking shipments to supporting coastal defense and disaster response. To realize the full potential of satellite imagery, governments deploy AI to analyze it. However, developing new capabilities such as identifying specific vessel types, analyzing port congestion, or monitoring infrastructure requires teams to collect fresh images, secure access to sensitive locations such as ports, and pay human teams to annotate millions of photographs.
Even with substantial resources, building new datasets this way is slow and costly. Successful efforts still remain limited in scope, because edge cases such as vessels in adverse weather or containers under varying lighting are difficult to capture with real-world data alone.
Methodology
Over the past year, NTT DATA's Innovation Center collaborated with Bifrost AI to evaluate whether synthetic data could accelerate AI model development while maintaining or improving quality and reducing costs.
Rather than relying exclusively on real-world satellite data, the team co-developed a hybrid approach that combines real and synthetic training data. This methodology enabled the team to rapidly identify model performance gaps and generate targeted synthetic datasets to address specific weaknesses. For instance, when early models showed reduced accuracy in detecting small vessels near coastlines, the partnership generated synthetic training scenarios to improve performance in those conditions.
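The gap-driven loop described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the slice names, evaluation records, target recall, and image budget are all made-up assumptions, and the two helper functions are hypothetical. The sketch measures recall per scenario slice, then splits a synthetic image budget across the slices that fall short of the target, in proportion to how far each one falls short.

```python
from collections import defaultdict


def recall_by_slice(records):
    """Compute recall per scenario slice.

    records: iterable of (slice_name, detected) pairs, one per ground-truth object.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for slice_name, detected in records:
        totals[slice_name] += 1
        hits[slice_name] += int(detected)
    return {name: hits[name] / totals[name] for name in totals}


def plan_synthetic_budget(recalls, target_recall, total_images):
    """Split a synthetic image budget across weak slices, proportional to recall gap."""
    gaps = {
        name: target_recall - recall
        for name, recall in recalls.items()
        if recall < target_recall
    }
    total_gap = sum(gaps.values())
    if total_gap == 0:
        return {}  # every slice already meets the target
    return {
        name: round(total_images * gap / total_gap)
        for name, gap in gaps.items()
    }


# Made-up evaluation records (assumption, for illustration only).
records = (
    [("small_vessel_near_coast", True)] * 6
    + [("small_vessel_near_coast", False)] * 4
    + [("large_vessel_open_water", True)] * 19
    + [("large_vessel_open_water", False)] * 1
    + [("vessel_in_haze", True)] * 8
    + [("vessel_in_haze", False)] * 2
)

recalls = recall_by_slice(records)
plan = plan_synthetic_budget(recalls, target_recall=0.9, total_images=1000)
print(recalls)
print(plan)
```

In this toy setup, the slice with the largest recall gap receives the largest share of the synthetic budget, and slices that already meet the target receive none. A real workflow would likely use richer metrics and re-evaluate after each round of generation.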
The pilot was designed to answer a fundamental question for satellite analytics: Can synthetic data meaningfully improve the economics and timeline of AI model development?
Results
The collaboration produced two significant outcomes in controlled testing environments. First, synthetic data generation enabled development teams to produce training datasets for rare scenarios on demand, rather than waiting months for satellite tasking and real-world data collection. This approach demonstrated up to 300x faster iteration speeds in specific test cases, allowing teams to refine models and pursue new customer requirements within days rather than quarters.
Second, synthetic data eliminated many traditional cost drivers associated with satellite AI development, including satellite tasking, restricted location access, and manual image annotation. Because synthetic images are generated together with accurate labels, training cycles that previously required months and substantial capital investment could be completed in significantly shorter timeframes. Early estimates suggest potential data acquisition cost reductions of up to 70% in scenarios where synthetic data supplements real-world training sets.
Beyond accuracy metrics, the pilot demonstrated NTT DATA's ability to expand detection capabilities across a broader range of objects and environmental conditions than traditional approaches would economically support. This enables the organization to serve customer needs that were previously impractical due to data collection constraints.
Benefits for stakeholders
The partnership comes at a pivotal moment for satellite-based AI services. Global demand for satellite intelligence continues to expand across the defense, logistics, and infrastructure sectors. Organizations that manage port operations can now access detection capabilities for congestion monitoring and vessel tracking without waiting months for data collection. Logistics providers benefit from faster deployment of new analytical models that track shipments and supply chain disruptions in near real time.
For government agencies responsible for coastal security, maritime safety, and disaster response, the ability to rapidly prototype new detection capabilities represents a meaningful operational advantage. The approach allows service providers to respond to safety-critical and security-sensitive stakeholder requirements with greater agility while operating at more predictable cost structures.
While traditional development approaches require months of data collection for each new capability, synthetic data enables organizations to validate new detection models rapidly and scale secure, trusted services to meet diverse customer needs across maritime zones and operational scenarios.
Note: Some of the figures shared are based on Bifrost's own verification.




