Automated CV translation with GenAI

In one of our projects, our developers used GenAI services available through Azure to significantly speed up the translation of consultant CVs, saving hours of work and shortening application processes. In this blog post, I interview Torbjörn Sjöberg, one of our Data Scientist Consultants, who was part of the development team, to learn more about the project and its results, with additional comments from others in the team.

In a few months, a small team developed the GenAI application and web interface, and the service was made available for testing. The test users reported that the time spent translating a CV decreased by 1.5 hours on average. Considering that translating a longer CV costs only around $0.20, this is a valuable improvement.

All test users appreciated the service, and they stated they would use it as well as recommend it to others. While the actual time saved per user is small (since consultants often only need to translate their CVs a few times per year), having a faster process potentially brings additional business value, as it can be time critical to produce translated CVs.

Jonas: Could you describe the business problem you addressed?

Torbjörn: Our consultants have CVs in English, but it isn’t uncommon for potential customers to request CVs in other languages. Translating and maintaining the CVs falls on the consultants themselves, but it’s our Sales team who depend on the result. The maintenance and translation can be time-consuming. At times, it can also be critical to have a translated CV ready at short notice. All in all, several people, and potentially the entire organisation, would benefit from a more efficient solution.

Jonas: And can you describe the solution you developed?

Torbjörn: In the tool that our consultants use to maintain their CVs, there is now a “Translate CV” button. A modal appears and prompts the user to select a language (currently only translation from English to Swedish is available). In the tool, the CVs are already split into parts, such as “Previous projects” and “Education”. Only fields typically containing sentences are translated, whereas fields that typically contain one or a few words are not translated. The translated copy is then available for the consultant to edit, with an attached warning that the CV needs to be manually reviewed.

Behind the scenes, the CV is sent as a JSON object to an AWS Lambda function running a serverless Python script. Each part to be translated is handled separately: it is first sent to the Azure translation service and then to a GPT-4 instance, which is prompted to rewrite the translated text in more natural language.
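The flow described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the field names, the `needs_manual_review` flag, the `event["body"]` payload shape and the two callables `translate` and `rewrite` (stand-ins for the Azure translation call and the GPT-4 rewrite call) are all assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical field names. The post only says that fields which typically
# contain sentences are translated, while short one-word fields are not.
TRANSLATABLE_FIELDS = {"summary", "previous_projects", "education"}

TextStep = Callable[[str], str]


def translate_cv(cv: Dict[str, Any], translate: TextStep, rewrite: TextStep) -> Dict[str, Any]:
    """Return a copy of the CV with each translatable part processed on its own."""
    result = dict(cv)
    for field, text in cv.items():
        if field in TRANSLATABLE_FIELDS and isinstance(text, str) and text.strip():
            draft = translate(text)         # step 1: machine translation
            result[field] = rewrite(draft)  # step 2: LLM rewrite for natural language
    # Assumed flag mirroring the warning that the copy must be reviewed manually.
    result["needs_manual_review"] = True
    return result


def make_handler(translate: TextStep, rewrite: TextStep):
    """Build a Lambda-style handler; the two steps are injected so they can be stubbed."""
    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        cv = json.loads(event["body"])  # assumes the CV arrives as a JSON string
        translated = translate_cv(cv, translate, rewrite)
        return {"statusCode": 200, "body": json.dumps(translated, ensure_ascii=False)}
    return handler
```

Passing the two steps in as plain functions keeps the sketch independent of any particular SDK, and it means either step could be swapped or tested in isolation.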

Jonas: What were the conditions that made this project possible? I.e. what needs to be in place for this type of GenAI application to work?

Torbjörn: Disregarding the preexisting tools for CV maintenance, which were a requirement for this project but not related to the GenAI functionality, the only real precondition was that we had Azure’s AI services available, so we could quickly start experimenting before we built any automated solutions. There were also legal requirements, both general ones such as GDPR and others specific to our use case, which were fulfilled because our preexisting setup was already compliant.

Jonas: How did you reach this specific solution? What did the development process look like?

Torbjörn: By the time I got involved, some aspects of the solution were already decided upon, such as the two-step process and which GenAI applications to use in each step. From that point on, many of the decisions were purely pragmatic. Which is the simplest solution? Is it good enough? As an example, the reason we translate the chunks separately is that this was the format the CVs were stored in natively. We tested whether we could handle the chunks separately to avoid having to concatenate and then split them, and it worked well enough for us not to explore more complex solutions.

(To understand how the two-step process was developed, I spoke to Jandeep Singh Malhi, who was part of the project from the start. He described an exploratory process of manual testing, comparing the results of different models and services available to Solita. The Azure translation service was perceived to give higher translation quality, while GenAI was perceived to give higher writing quality, so the two-step process was tested and decided upon.)

Jonas: Is there anything in the process you would do differently if you were to build a similar solution?

Torbjörn: One thing that bugs me is that we didn’t set up any specific process to benchmark the quality of the translation. I would make that a part of the iterative process. Most of the time was spent on web development, and I think there is a lot that could potentially be improved in the GenAI application, such as further prompt engineering and more experimenting with the two-step solution. The simplest benchmarking strategy would probably have been to use another LLM for scoring. There are also benchmarking tools available in Azure. This will become even more relevant if we decide to cover more languages, since the development team might then no longer be able to judge the quality themselves.
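As a rough illustration of the "another LLM for scoring" idea, the sketch below scores source and translation pairs with a judge model. It was not part of the project. The prompt wording, the 1 to 5 scale, the threshold and the `ask_judge` callable (a stand-in for whatever LLM call would be used) are assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical judge prompt; the scale and wording are illustrative only.
JUDGE_PROMPT = (
    "Rate the Swedish translation of the English CV text from 1 (unusable) "
    "to 5 (fluent and faithful). Answer with a single digit.\n\n"
    "English:\n{source}\n\nSwedish:\n{translation}"
)


def parse_score(reply: str) -> int:
    """Pull the first digit between 1 and 5 out of the judge's reply."""
    for ch in reply:
        if ch in "12345":
            return int(ch)
    raise ValueError(f"no score found in reply: {reply!r}")


def benchmark(
    pairs: List[Tuple[str, str]],
    ask_judge: Callable[[str], str],
    threshold: int = 3,
) -> Dict[str, object]:
    """Score each (source, translation) pair and flag those below the threshold."""
    scores = [
        parse_score(ask_judge(JUDGE_PROMPT.format(source=src, translation=tgt)))
        for src, tgt in pairs
    ]
    flagged = [pair for pair, score in zip(pairs, scores) if score < threshold]
    return {"mean_score": mean(scores), "scores": scores, "flagged": flagged}
```

Run against a fixed set of CV excerpts after each prompt change, a report like this would give a comparable number per iteration, and the flagged pairs would show where a human reviewer should look first.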

Jonas: You did set up a user test where consultants could try the solution and review it via a form. Was that helpful?

Torbjörn: Well, it is difficult to get enough commitment from voluntary user testing to support review and maintenance over time. Benchmarking would also support that aspect. It might also have been helpful to run user tests earlier in the process, to get more guidance and more opportunity for exploration at that stage.

The TL;DR

With the GenAI services available through Azure, experimenting with and designing a translation tool adapted to the specific business problem was a fairly quick win. Most of the required work was web development rather than GenAI-related.

As for learnings: a similar project might benefit from putting work into developing proper benchmarking and evaluation at an early stage of the project. All in all, a small team working for a couple of months generated both a working solution to a viable business problem and valuable learnings on how to improve future GenAI projects. Now, we save time and money while having quicker response times to requests requiring CV translation.
