Google Cloud aims to open up AI to noncoders with translation, document and vision apps

At its recent Cloud Next events, held in New York, San Francisco and Munich as well as online, Google announced a set of new products and enhancements to existing ones, broadly focused on opening up artificial intelligence to people who do not have data science skills. After all, data scientists remain in short supply: that role shared the top spot with software engineers in 451 Research's AI & Machine Learning, Use Cases 2022 survey, with each chosen by 23% of respondents when asked which areas of skills shortage were holding back their AI/ML initiatives.

The Take

Google Cloud knows that it is often second or third in line behind AWS and Microsoft Corp.'s Azure, and under the stewardship of Thomas Kurian it has been making a lot of changes, both to bolster its sales and support teams and to focus on key areas where it perceives differentiation, machine learning among them. While acknowledging that each of the three major U.S. cloud platforms has its strengths, "Googlers" we spoke with at Next in New York reckon the quality of Google's models is a differentiator in customer engagements, driven in part by its Google Brain and DeepMind research units, where some of the work behind these new products originated.

Details

Google Cloud made many announcements at Next, but in this report we focus on the three main ones that have AI and ML at their core.

Vertex AI Vision, which launched in public preview in October, is an application platform for building, deploying and managing computer-vision-based AI applications. Using Vertex AI Vision's low-/no-code environment, developers can ingest live video streams, add pretrained or custom models, and define a target location for the output, such as Google's BigQuery warehouse or Vertex AI Vision's own Warehouse. The offering comprises data access, pipelines and low-/no-code tools that enable functions such as object detection, face blurring, personal protective equipment detection and occupancy analytics, with additional features on the way.

It can handle live video streams and is underpinned by a knowledge graph that helps interpret the content of the video, enabling it to handle more complicated scenes, such as a person standing in front of a fireworks display, and to identify each element. Google sees opportunities in areas such as physical security, safety and adding intelligence to the legacy digital asset management platforms used by media providers, which often lack sophisticated computer vision capabilities.
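Once an application's output lands in BigQuery, it can be consumed with ordinary SQL tooling. The sketch below is a minimal, hypothetical illustration of querying occupancy-analytics results: the project ID, dataset, table and column names are all assumptions, since the actual schema depends on how the application's output target is configured.

```python
# A minimal sketch of consuming Vertex AI Vision output routed to BigQuery.
# The table `vision_app.occupancy_events` and its columns (zone, person_count,
# event_time) are hypothetical; real schemas depend on the app's configuration.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

query = """
    SELECT zone, AVG(person_count) AS avg_occupancy
    FROM `my-project.vision_app.occupancy_events`
    WHERE event_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
    GROUP BY zone
    ORDER BY avg_occupancy DESC
"""

# Run the query and print average occupancy per zone over the past hour.
for row in client.query(query).result():
    print(f"{row.zone}: {row.avg_occupancy:.1f} people on average")
```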

Document AI is Google’s tool for extracting insight from unstructured data in documents and turning it into something that can be searched for and analyzed. It includes two components: Document AI Warehouse, a managed service that is now generally available, having been launched into preview last spring; and Document AI Workbench, a tool for building custom ML models that doesn’t require any coding.

Users can start with pretrained models and then upload as few as 20 documents to uptrain a model for their specific needs. There is a "human in the loop" element as well, so business users can review, correct and label the data extracted from documents.
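For a rough sense of the developer-facing side, the sketch below sends a document to a Document AI processor via the documentai_v1 Python client and prints the extracted entities that a human reviewer would then verify. The project, location, processor ID and file name are placeholder assumptions; an uptrained custom processor is addressed the same way as a pretrained one.

```python
# A minimal sketch of processing a document with the Document AI v1 client.
# Project, location, processor ID and file name are assumed placeholders.
from google.cloud import documentai_v1 as documentai

client = documentai.DocumentProcessorServiceClient()

# Full resource name of a pretrained or uptrained processor.
name = client.processor_path("my-project", "us", "my-processor-id")

with open("invoice.pdf", "rb") as f:
    raw_document = documentai.RawDocument(content=f.read(), mime_type="application/pdf")

result = client.process_document(
    request=documentai.ProcessRequest(name=name, raw_document=raw_document)
)

# Extracted entities would then be surfaced for human review and correction.
for entity in result.document.entities:
    print(entity.type_, entity.mention_text, f"confidence={entity.confidence:.2f}")
```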


Translation Hub, an enterprise-grade translation service, is the third of the major announcements. The basic version became generally available in October, while the advanced version is now in preview. The latter includes support for 135 languages, custom glossaries, translation-quality prediction scores, custom AutoML models in 63 language pairs and post-editing workflow support, plus the ability to maintain different portals for different departments within an organization.

Key to its utility in business is Translation Hub's ability to retain the structure of a document being translated, so a piece of marketing collateral in English will look the same when its text is rendered in, say, Turkish. The tool also lets users score the accuracy of a translation, and it flags passages where it has doubts so that a workflow can be kicked off for further human involvement.
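Translation Hub itself is consumed through a portal rather than code, but the layout-preserving behavior it relies on can be seen in the Cloud Translation API v3's document translation. The sketch below is a minimal illustration under that assumption; the project ID, regional location and file names are placeholders.

```python
# A minimal sketch of layout-preserving document translation with the
# Cloud Translation API v3, the programmatic relative of Translation Hub's
# portal workflow. Project ID, location and file names are assumptions.
from google.cloud import translate_v3 as translate

client = translate.TranslationServiceClient()
parent = f"projects/{'my-project'}/locations/us-central1"

with open("brochure_en.pdf", "rb") as f:
    document_input_config = translate.DocumentInputConfig(
        content=f.read(), mime_type="application/pdf"
    )

response = client.translate_document(
    request={
        "parent": parent,
        "target_language_code": "tr",  # e.g., English-to-Turkish
        "document_input_config": document_input_config,
    }
)

# The translated bytes retain the source document's formatting.
with open("brochure_tr.pdf", "wb") as f:
    f.write(response.document_translation.byte_stream_outputs[0])
```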

Another interesting AI-related announcement at Next was the OpenXLA project, an open-source ecosystem designed to ensure that ML models developed in frameworks such as TensorFlow, PyTorch and JAX can be executed on a variety of high-performance hardware, including CPUs, GPUs, TPUs and other accelerators. Other vendors in the ecosystem include Advanced Micro Devices Inc., Apple Inc., Arm Ltd., Intel Corp., Meta Platforms Inc. and Nvidia Corp. The project is open, so it will be interesting to see whether AWS and Microsoft join.
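The portability OpenXLA targets is already visible in JAX, where a plain Python function is compiled through XLA and runs unchanged on whichever backend is present. A minimal sketch:

```python
# A minimal sketch of XLA-backed portability: the same JAX function is
# JIT-compiled for whatever backend is available (CPU, GPU or TPU).
import jax
import jax.numpy as jnp

@jax.jit  # traced once, then compiled by XLA for the active backend
def predict(weights, inputs):
    return jax.nn.relu(inputs @ weights)

weights = jnp.ones((4, 2))
inputs = jnp.arange(8.0).reshape(2, 4)

print(jax.devices())           # e.g., [CpuDevice(id=0)] or GPU/TPU devices
print(predict(weights, inputs))  # identical code, hardware-specific compilation
```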
