The United Nations has estimated that by 2040 almost 60% of the world's population will be living in cities, while the number of truck, car, and air kilometres will double and emissions will keep rising. In addition, millions of people and animals will be killed by traffic, both directly (accidents) and indirectly (emissions).
For decades, traffic congestion has been tackled by building new roads and lanes, but this is mostly only a temporary cure, as it triggers latent traffic, also known as induced traffic. This is a phenomenon where people decide to travel by car when they otherwise wouldn't have done so: in short, increased supply (road capacity) creates increased demand (trips).
In recent years, however, there has been growing discussion about whether data and data-driven solutions could be a cost-efficient way, with long-lasting impact, to make transport greener, more adaptive, and more accessible. Being data-driven or using data doesn't mean throwing all the existing, working traffic management procedures into the rubbish bin. On the contrary, it is about gradual improvements alongside them. Even today, some 164 EB (1 EB = 10¹⁸ bytes) of traffic data is generated per month, so a wide range of potential use cases for data solutions already exists.
Daunted by traffic and congestion
The challenges of transport are global, but one of the countries worst affected in terms of sustainability, congestion, and transport safety is China. In China alone, there are more than 160 cities with over 1 million inhabitants each, and over 254 million registered vehicles. Unsurprisingly, urbanisation and motorisation together lead to massive adverse effects: roughly 1,500 megatons of CO2 and almost 1 million emissions-related deaths annually. Congestion is overwhelming as well: the average Chinese motorist loses nine days a year stuck in traffic. If nothing is done, the future of road transport doesn't look rosy. Or does it?
Data weaves traffic management together
Together with our Chinese partner, Enjoyor Co., we started a joint pilot research project in Hangzhou, the capital of Zhejiang province, China. Enjoyor is a leading Chinese traffic management company with annual revenue of over 400 million euros, responsible for traffic management in many Chinese cities such as Hangzhou, Fuzhou, and Nanchang. Hangzhou, home of the world-famous Alibaba, has over 13 million people in its metropolitan area and is a typically crowded city, ranked among China's top ten most congested.
The first phase of the pilot aims at predicting traffic speeds on different road sections for the next 10-15 minutes, in three-minute intervals, by crunching historical data with machine learning algorithms. The pilot area consists of 309 road sections and 118 intersections, and the data includes several parameters, such as road topology, road section speeds, and signalling schemes. The data will also be enriched with traffic incident and weather data.
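To make the setup concrete, here is a minimal sketch of how such a short-horizon prediction could be framed: lagged section speeds on a three-minute grid as features, and the next step's speed as the target. The column names, file name, and model choice are illustrative assumptions, not the pilot's actual implementation.

```python
# Minimal sketch: predicting a road section's speed one 3-minute step
# ahead from lagged historical speeds. Names and model are assumptions.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical input: one row per (section_id, timestamp), 3-minute grid.
df = pd.read_csv("section_speeds.csv", parse_dates=["timestamp"])
df = df.sort_values(["section_id", "timestamp"])

# Lagged speeds over the previous 15 minutes as features.
for lag in range(1, 6):  # 1..5 steps of 3 minutes = 3..15 min of history
    df[f"speed_lag_{lag}"] = df.groupby("section_id")["speed"].shift(lag)

# Target: speed one step ahead (repeat per horizon for 6, 9, ... 15 min).
df["speed_next"] = df.groupby("section_id")["speed"].shift(-1)
df = df.dropna()

features = [f"speed_lag_{lag}" for lag in range(1, 6)]
model = GradientBoostingRegressor()
model.fit(df[features], df["speed_next"])
```

In practice, features such as signalling schemes, road topology, and incident flags would be joined in alongside the lags.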
The overall long-term objective of the pilot research is to build capabilities to identify traffic phenomena, such as congestion, in advance through advanced analytics, and to use that information to manage and optimise traffic at a regional level. In practice, this means that the trigger points indicating the possible emergence of a traffic phenomenon should be identified as early as possible. What makes this both interesting and challenging is that the factors affecting traffic vary over time, their relations aren't fixed, and there are many of them. Traditional management models are therefore no longer sufficient, and there is demand for AI-based models that can be continuously updated, or can update themselves.
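One simple pattern for such continuously updated models is incremental learning, where fresh observations are folded into the model without retraining from scratch. A minimal sketch, using scikit-learn's incremental API; the feature layout and batch cadence are placeholder assumptions:

```python
# Sketch: a model that updates itself as new traffic observations
# arrive, via scikit-learn's partial_fit. Dummy data for illustration.
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor()

def update_with_batch(X_new: np.ndarray, y_new: np.ndarray) -> None:
    """Fold the latest observations into the model without retraining."""
    model.partial_fit(X_new, y_new)

# Example: every 3 minutes, feed the freshest lagged-speed features.
X_batch = np.random.rand(32, 5)   # 32 sections x 5 lag features (dummy)
y_batch = np.random.rand(32)      # observed next-step speeds (dummy)
update_with_batch(X_batch, y_batch)
```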
An AI-spiced digital twin with Cloud and Edge computing
In many cases, we highlight the value of data by having a crystal-clear business case and knowing how important tangible outcomes are. In other cases, we just experiment, learn, and have a bit of fun, and it is exactly in this latter case that you have the freedom to play around. Why not? We throw out the assumption that someone already knows the solution to the problem, or that something called "the business" understands more than ordinary human beings. In the spirit of the famous Agile manifesto, we could say we value the unknown over what we think we already know. As people, we are sometimes biased and tend to look at things very narrowly, especially when it comes to traffic. So wouldn't building a digital twin of traffic be fun?
The challenge of building smart systems relates to Conway's law, the adage that organisations design systems that mirror their own communication structures. In our work, we are dedicated to a model where everyone in the team can time-box a bit of their study, work in parallel, and vote together for the winning ideas, so we aren't building a "one-size-fits-nobody" solution. In our pilot research, this was a huge success: in a few days we already had spatial data visualisations, data replication running, and TinyML running on Edge. Now comes the interesting part: how to integrate all of this into something you might see in PowerPoint slides as a future-proof architecture?
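As an aside on the TinyML part: one common way to get a model small enough for constrained Edge hardware is post-training quantisation with TensorFlow Lite. A minimal sketch; the toy model below stands in for whatever was actually deployed in the pilot.

```python
# Sketch: shrinking a small Keras model for TinyML on Edge via
# TensorFlow Lite post-training quantisation. Toy model, assumed shapes.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(5,)),         # e.g., 5 lagged speed features
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),                  # predicted speed
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantisation
tflite_model = converter.convert()

with open("speed_model.tflite", "wb") as f:
    f.write(tflite_model)  # kilobytes in size, fit for constrained devices
```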
Communication, team motivation, and keeping things small where possible are what make this easy to adopt. The team chose Amazon SageMaker Autopilot, which can train and optimise hundreds of candidate models to find one that is a good fit for us. At the same time, a few of us were using AWS SageMaker to run machine learning models that look for anomalies and anything suspicious in the data, and we immediately found time series that had been artificially "fixed" (a typical shortcut in machine learning cases where data is missing). This incorrect way of patching the data resulted in bad models, and it could not have been detected with visualisation or typical data engineering tools.
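Naively imputed data like this often betrays itself as implausibly long runs of identical values, which can be flagged programmatically. A small sketch of that idea; the file name, column names, and run-length threshold are assumptions to tune per dataset, not the pilot's actual checks.

```python
# Sketch: flagging time series "fixed" by naive imputation (e.g.,
# forward-filling), visible as long runs of identical values.
import pandas as pd

def longest_constant_run(series: pd.Series) -> int:
    """Length of the longest run of consecutive identical values."""
    change = series.ne(series.shift()).cumsum()  # new run id at each change
    return int(series.groupby(change).size().max())

speeds = pd.read_csv("section_speeds.csv", parse_dates=["timestamp"])
suspicious = [
    section
    for section, grp in speeds.groupby("section_id")
    if longest_constant_run(grp["speed"]) > 10  # >30 min frozen at 3-min steps
]
print("Possibly imputed sections:", suspicious)
```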
After a few coffee breaks, the first data API product was available on the AWS serverless API development portal. Taking machine learning to the Edge (EdgeOps), where resources are very limited, can be accomplished with a rule of thumb: keep it very small and simple, because running all inference at the Edge brings benefits such as improved latency, security, and resilience. By the end of the week, we could see which parts were common, so we could automate everything using AWS CDK and keep only the parts that are really required, avoiding feature creep.
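To give a flavour of what automating the common parts with AWS CDK might look like, here is a minimal sketch of a serverless data API stack (CDK v2, Python). The resource names and handler paths are illustrative assumptions, not the actual project code.

```python
# Sketch: a serverless data API defined with AWS CDK v2 in Python.
# Names and the lambda/ asset directory are hypothetical.
import aws_cdk as cdk
from aws_cdk import aws_apigateway as apigw
from aws_cdk import aws_lambda as _lambda


class TrafficApiStack(cdk.Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        # Lambda serving predicted section speeds.
        handler = _lambda.Function(
            self, "SpeedHandler",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="index.handler",
            code=_lambda.Code.from_asset("lambda"),  # hypothetical directory
        )

        # REST API fronting the Lambda: the "data API product".
        apigw.LambdaRestApi(self, "SpeedApi", handler=handler)


app = cdk.App()
TrafficApiStack(app, "TrafficApiStack")
app.synth()
```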
So have we built a full-blown digital twin? Not yet, but that doesn't matter, because we have found the relevant feature importances in our data assets, and we cannot wait to proceed to the next step. Setting even crazy goals and experimenting with new services from hyperscalers like AWS can be the first step towards something new. We now have scalable Edge, Cloud, and MLOps solutions, along with a few rock-solid models to teach us something new. The technology capabilities are outstanding, but amid the hype it's good to remember that a good team with self-belief and mutual trust is equally important. And we encourage you to set your data and machine learning models free!
From tech to business and vice versa
For Solita, V2X is a spearhead project in holistic industrial Internet of Things capability development, covering the whole funnel from a moving vehicle (car, truck, ship, machine) or stationary device (process, production) to an end user via a Cloud system, including data pipelines, real-time data processing, Edge computing, and access control, and in some cases data farming when source data is not yet available. Technology development isn't the main point; the novelty comes from collaboration between different people and ecosystems.
On the business side, we can offer:
- Land & Scale: Whether the use case concerns a single application or a whole fleet, we assure the customer that the application will be in use within a few weeks. Solid technical feasibility and a sound foundation ensure rapid deployment and picking the low-hanging fruit quickly, without compromising long-term objectives or scalability.
- White-label AI for different applications: Battle-tested architecture and services, juiced with our ready-made AI features and components, can easily be applied (i.e., plug and play) to various use cases and domains.
- MLOps & Edge computing: Continuous training, plus automatically maintained and deployed production pipelines for machine learning algorithms running on Edge, providing short response times.
- Access to data 24/7: No more immature processes and data silos. We get rid of data and process silos through real-time ingestion of process and device data to the Cloud.
- Operational excellence (DataOps) enabled by state-of-the-art information.