Technology

The Three Biggest Advancements in AI for 2023

In many ways, the year 2023 marked the beginning of people’s understanding of artificial intelligence (AI) and its potential. That was the year governments started to take AI risk seriously and the year chatbots went viral for the first time. These advancements weren’t so much new inventions as they were concepts and technologies that were coming of age after a protracted gestation period.

However, there were also a lot of fresh inventions. These are the top three from the previous year:

Multimodality

Although the term “multimodality” may sound technical, it’s important to know that it refers to an AI system’s capacity to handle a wide variety of data types, including audio, video, images, and text.

This year marked the first time that robust multimodal AI models were made available to the general public. The first of these, GPT-4 from OpenAI, let users upload images in addition to text inputs. With its ability to “see” images, GPT-4 opens up a plethora of possibilities. For instance, you could ask it to decide what to have for dinner based on a picture of what’s in your refrigerator. In September, OpenAI added the ability for users to converse with ChatGPT by voice as well as text.
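As an illustration of what "image in addition to text" means in practice, a multimodal request pairs both inputs in a single message. The sketch below builds such a payload in the content-parts style used by vision-capable chat APIs; the model name and field layout are assumptions for illustration, and no request is actually sent:

```python
# Build a multimodal chat message: a text question plus an image,
# in the content-parts style used by vision-capable chat APIs.
def build_dinner_prompt(image_url: str) -> dict:
    return {
        "model": "gpt-4-vision-preview",  # assumed model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "What could I cook with what's in this fridge?"},
                    {"type": "image_url",
                     "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_dinner_prompt("https://example.com/fridge.jpg")
```

The point is simply that one user turn can carry several typed parts, which is what lets the model "see" the fridge photo while reading the question.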

Announced in December, Google DeepMind’s most recent model, Gemini, is also capable of processing audio and images. In a Google launch video, the model was shown identifying a duck from a line drawing on a Post-it note. In the same video, Gemini came up with an image of a pink and blue plush octopus after being shown a picture of pink and blue yarn and asked what they could make. (The promotional film gave the impression that Gemini was watching moving images and reacting to voice commands in real time. However, Google stated in a blog post on its website that the video had been trimmed for brevity and that the model had been prompted with still images and text rather than live video and audio, even though the model does have those capabilities.)

“I think the next landmark that people will think back to, and remember, is [AI systems] going much more fully multimodal,” Google DeepMind co-founder Shane Legg said on a podcast in October. “It’s early days in this transition, and when you start really digesting a lot of video and other things like that, these systems will start having a much more grounded understanding of the world.” In an interview with TIME in November, OpenAI CEO Sam Altman said multimodality in the company’s new models would be one of the key things to watch out for next year.

Multimodality offers benefits beyond making models more practical. The models can also be trained on a wealth of new data sets, including audio, video, and images, which together contain more information about the world than text can. Many of the world’s leading AI companies hold the view that these models will become more powerful or capable as a result of this new training data. It is a step toward “artificial general intelligence,” the kind of system that can equal human intellect, producing labor that is economically valuable and leading to new scientific discoveries. This is the hope held by many AI scientists.

Constitutional AI

How to align AI with human values is one of the most important unsolved problems in the field. If AI systems come to surpass humans in intelligence and power, they have the potential to unleash immense damage on our species (some even predict its extinction) unless they are somehow restrained by rules that prioritize human well-being.

The method that OpenAI employed to align ChatGPT (and to steer clear of the racist and sexist tendencies of previous models) was successful, but it required a significant amount of human labor. The method is called “reinforcement learning with human feedback,” or RLHF. Human raters evaluated the AI’s responses and, if a response was helpful, safe, and in line with OpenAI’s list of content guidelines, awarded it the computational equivalent of a dog treat. By rewarding the AI for good behavior and punishing it for bad behavior, OpenAI created a reasonably safe and efficient chatbot.
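The reward-and-penalty loop can be caricatured in a few lines. The "reward model" below is a trivial stand-in (real RLHF trains a neural reward model on human preference labels); it just scores candidate replies so the higher-scoring one can be reinforced:

```python
# Toy sketch of the RLHF selection step: a reward model scores
# candidate responses, and training reinforces the higher-scoring one.
def toy_reward(response: str) -> float:
    # Stand-in for a learned reward model trained on human ratings:
    # reward substance (word count, capped) and penalize unsafe content.
    score = min(len(response.split()), 20) / 20
    if "UNSAFE" in response:
        score -= 1.0  # the computational opposite of a dog treat
    return score

candidates = [
    "Sure, here is a careful, step-by-step answer to your question.",
    "UNSAFE reply that violates the content guidelines.",
]
best = max(candidates, key=toy_reward)
```

In actual RLHF the scores come from thousands of human judgments, which is exactly the labor cost the next paragraph describes.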

However, the RLHF process’s heavy reliance on human labor raises serious questions about its scalability. It is expensive. It is susceptible to the biases or errors of individual raters. The longer the list of rules, the greater the likelihood of failure. And it doesn’t seem workable for AI systems that become so capable that they start doing things humans can’t comprehend.

Constitutional AI, first introduced in a December 2022 paper by researchers at the prestigious AI lab Anthropic, aims to solve these issues by exploiting the fact that AI systems can now comprehend natural language. The concept is straightforward. You start by writing a “constitution” that outlines the principles you want your AI to uphold. The AI is then trained to grade responses according to how closely they adhere to the constitution, and the model is given incentives to produce responses that receive higher scores. Reinforcement learning from human feedback is replaced by reinforcement learning from AI feedback. The Anthropic researchers stated that “these methods make it possible to control AI behavior more precisely and with far fewer human labels.” Anthropic’s 2023 answer to ChatGPT, Claude, was aligned using constitutional AI. (Among the investors in Anthropic is Salesforce, whose CEO, Marc Benioff, is co-chair of TIME.)
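The loop above can be sketched in miniature. The grader here is a trivial keyword check purely for illustration; in constitutional AI the grading is done by an AI model judging each response against the written principles:

```python
# Toy sketch of constitutional AI: candidate responses are graded
# against a written constitution, and the best-scoring one is reinforced.
CONSTITUTION = [
    "be helpful",
    "avoid insults",
]

def grade(response: str) -> int:
    # Stand-in grader; real constitutional AI asks a model to judge
    # adherence to each principle and uses the scores as the reward.
    score = 0
    if "insult" not in response.lower():
        score += 1  # principle: "avoid insults"
    if len(response) > 10:
        score += 1  # crude proxy for "be helpful"
    return score

responses = ["Here is a thorough, polite explanation.", "insult"]
best = max(responses, key=grade)
```

Because the grading step is automated, no human needs to label each comparison; only the constitution itself is human-written.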

“With constitutional AI, you’re explicitly writing down the normative premises with which your model should approach the world,” Jack Clark, Anthropic’s head of policy, told TIME in August. “Then the model is training on that.” There are still problems, like the difficulty of making sure the AI has understood both the letter and the spirit of the rules (“you’re stacking your chips on a big, opaque AI model,” Clark says), but the technique is a promising addition to a field where new alignment strategies are few and far between.

Naturally, Constitutional AI does not address the issue of whose values AI ought to be in line with. However, Anthropic is attempting to make that decision more accessible to all. The lab conducted an experiment in October wherein it asked a representative sample of one thousand Americans to assist in selecting rules for a chatbot. The results showed that, despite some polarization, it was still possible to draft a functional constitution based on statements that the group reached a consensus on. These kinds of experiments may pave the way for a time when the general public has far more influence over AI policy than it does now, when regulations are set by a select group of Silicon Valley executives.
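The consensus step in that experiment can be sketched under a simple assumption: each proposed rule comes with the fraction of participants who endorsed it, and only rules clearing a threshold make it into the draft (the threshold and statements below are illustrative, not Anthropic's data):

```python
# Draft a constitution from public input by keeping only the rules
# that a clear majority of participants endorsed.
def draft_constitution(votes: dict[str, float], threshold: float = 0.7) -> list[str]:
    return [rule for rule, agreement in votes.items() if agreement >= threshold]

votes = {
    "The AI should not give medical advice.": 0.82,
    "The AI should prioritize free expression above all.": 0.41,
    "The AI should be honest about its limitations.": 0.91,
}
rules = draft_constitution(votes)
```

Polarizing statements fall below the threshold and drop out, which is how a functional constitution can emerge even from a divided group.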

Text to Video

The rapidly increasing popularity of text-to-video tools is one obvious result of the billions of dollars that have been invested in AI this year. Text-to-image technologies had just begun to take shape a year ago; today, a number of businesses are able to convert sentences into moving pictures with ever-increasing precision.

One of those businesses is Runway, an AI video startup with offices in Brooklyn that aims to enable anyone to make movies. With its most recent model, Gen-2, users can perform video-to-video editing—that is, altering an already-existing video’s style in response to a text prompt, such as transforming a picture of cereal boxes on a tabletop into a nighttime cityscape.

“Our mission is to build tools for human creativity,” Runway’s CEO Cristobal Valenzuela told TIME in May. He acknowledges that this will have an impact on jobs in the creative industries, where AI tools are quickly making some forms of technical expertise obsolete, but he believes the world on the other side is worth the upheaval. “Our vision is a world where human creativity gets amplified and enhanced, and it’s less about the craft, and the budget, and the technical specifications and knowledge that you have, and more about your ideas.” (Investors in Runway include Salesforce, where TIME co-chair and owner Marc Benioff is CEO.)

Pika AI, another startup in the text-to-video space, claims to be producing millions of new videos every week. The startup, headed by two Stanford dropouts, debuted in April but has already raised funding at a valuation of between $200 million and $300 million, according to Forbes. Free tools like Pika, aimed more at the average user than at professional filmmakers, are attempting to change the face of user-generated content. But text-to-video tools are computationally expensive to run, so don’t be shocked if they start charging for access once the venture capital runs out. That could happen as soon as 2024.

Google’s Isomorphic Labs Unveils AlphaFold 3, AI that Predicts Structures of Life’s Molecules

Google DeepMind and its sister company Isomorphic Labs have created a new artificial intelligence model that is purportedly more accurate than existing methods at predicting the configurations and interactions of all of life’s molecules.

The AlphaFold 3 system, according to co-founder of DeepMind Demis Hassabis, “can predict the structures and interactions of nearly all of life’s molecules with state-of-the-art accuracy including proteins, DNA, and RNA.”

Protein interactions are essential for drug discovery and development. Examples of these interactions include those between enzymes that are essential for human metabolism and antibodies that fight infectious illnesses.

In findings published on May 8 in the academic journal Nature, DeepMind said the model might drastically cut down on the time and expense needed to create medicines that have the potential to save lives.

Describing these new powers in a press release, Hassabis stated: “We can design a molecule that will bind to a specific place on a protein, and we can predict how strongly it will bind.”

AlphaFold had already revolutionized research by making 3D protein-structure prediction far more straightforward. Until AlphaFold 3’s improvements, however, it could not predict how a protein binds with another molecule.

Although the tool is limited to non-commercial use, scientists are reportedly excited about its increased predictive power and its ability to speed up the drug discovery process.

“AlphaFold 3 allows us to generate very precise structural predictions in a matter of seconds,” Isomorphic Labs said in a statement released on X.

“This discovery opens up exciting possibilities for drug discovery, allowing us to rationally develop therapeutics against targets that were previously difficult or deemed intractable to modulate,” the blog post continued.

Free Access via the AlphaFold Server

The AlphaFold Server, a recently released research tool, will be available to scientists for free, according to a statement made by Google DeepMind and Isomorphic Labs.

Isomorphic Labs is reportedly collaborating with pharmaceutical companies to harness the potential of AlphaFold 3 in drug design. The goal is to tackle practical drug design challenges and ultimately create novel, game-changing medicines for patients.

Since 2021, a database containing more than 200 million protein structures has made AlphaFold’s predictions freely available to non-commercial researchers. The resource has been cited thousands of times in academic works.

According to DeepMind, researchers may now conduct experiments with just a few clicks thanks to the new server’s simplified workflow.

AlphaFold Server’s web interface lets researchers enter data for a variety of biological molecule types using the FASTA format. Once the job is processed, the AI model displays a 3D view of the predicted structure.
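For illustration, a protein entry in FASTA format is a header line beginning with “>” followed by lines of amino-acid letters. A minimal sketch of parsing and sanity-checking such an entry (the sequence and header shown are made up, not a real submission):

```python
# Parse a minimal single-record FASTA string and check that the
# sequence uses only the 20 standard amino-acid letters.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")

def parse_fasta(text: str) -> tuple[str, str]:
    lines = [ln.strip() for ln in text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA records must start with a '>' header line")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    bad = set(sequence) - VALID_AA
    if bad:
        raise ValueError(f"non-standard residues: {sorted(bad)}")
    return header, sequence

record = """>example_protein hypothetical sequence
MKTAYIAKQR
QISFVKSHFS"""
header, seq = parse_fasta(record)
```

A researcher pastes exactly this kind of record into the server's input form; the "few clicks" DeepMind describes amount to submitting the sequence and waiting for the predicted structure.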

Phone.com Launches AI-Connect, a Cutting-Edge Conversational AI Service

AI-Connect, a revolutionary conversational speech artificial intelligence (AI) service, was unveiled by Phone.com today. AI-Connect, the newest development in Phone.com’s commercial phone system, offers callers and businesses a smooth and effective contact experience.

AI-Connect is specifically designed to handle inbound leads and schedule appointments without the clumsiness of cookie-cutter call routing or the expense of a contact center. This is ideal for small and micro businesses that need to take advantage of every opportunity to convert interest into sales but lack the luxury of an administrative team or a call center to handle the influx of prospects or sales calls.

AI-Connect can effectively manage duties like call routing, schedule management, and answering FAQs because it is built to engage in genuine, free-flowing conversations with callers. This capability is enabled by modern automatic speech recognition (ASR), large language model (LLM), text-to-speech (TTS), natural language understanding (NLU), and natural language processing (NLP) technologies.
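The stack just listed composes into a pipeline: speech is transcribed (ASR), intent is extracted (NLU/NLP), a reply is generated (LLM), and the reply is spoken back (TTS). The stage functions below are trivial stand-ins to show the data flow only; they are not Phone.com's implementation:

```python
# Toy data flow through a conversational voice-AI stack.
def asr(audio: bytes) -> str:
    return "I'd like to book an appointment"  # stand-in transcription

def nlu(text: str) -> str:
    # Stand-in intent extraction.
    return "book_appointment" if "appointment" in text else "unknown"

def llm_respond(intent: str) -> str:
    # Stand-in for LLM-generated, goal-oriented replies.
    replies = {"book_appointment": "Sure, what day works for you?"}
    return replies.get(intent, "Could you rephrase that?")

def tts(text: str) -> bytes:
    return text.encode()  # stand-in synthesized audio

def handle_call(audio: bytes) -> bytes:
    return tts(llm_respond(nlu(asr(audio))))

reply_audio = handle_call(b"fake-audio")
```

Each real component is a heavyweight model, but the composition itself is this simple: each stage's output is the next stage's input.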

The real differentiator with AI-Connect is its capacity for goal-oriented, conversational communication. The company’s use of an LLM in conjunction with a hybrid NLU/NLP infrastructure provides excellent intent recognition. The new service also leverages machine learning to deliver customized suggestions and detailed call metrics for every engagement.

Phone.com CEO and Co-Founder Ari Rabban stated, “AI-Connect is much more than just a service or new iteration of AI-enabled CX; it’s a strategic game-changer that strips away the burden of expensive, complicated technology designed for small businesses.” “AI-Connect, a component of our UCaaS platform, dismantles conventional barriers and gives companies of all sizes access to a realm of efficiency and expertise that would normally require significant time and investment.”

A professional voice greets customers and provides them with a number of easy options when they initiate a call to an AI-Connect script. AI-Connect guarantees that Phone.com customers maximize every engagement, regardless of their availability to answer, from easily arranging, rescheduling, or canceling appointments to smoothly connecting with a specific contact or department.

AI-Connect effectively filters out spam and other undesirable calls by utilizing sophisticated call screening capabilities, saving both business owners and callers important time.

Sophisticated conversational design facilitates the discussion between callers and AI-Connect, optimizing call flow and delivering the most effective real-time responses. Thanks to an intuitive user interface (UI), businesses can easily modify and implement AI-Connect to meet their specific needs.

“We look forward to embarking on the next chapter of communications with great anticipation, as innovation is in our DNA,” said Alon Cohen, the acclaimed Chief Technology Officer of Phone.com, whose engineering prowess produced the first VoIP call ever. Twenty years ago, the FCC’s Pulver Order, which removed certain IP-based communication services from conventional regulatory restrictions, ushered in a new age. “We are now in a position to investigate the transformational potential of AI-assisted interactions,” Cohen continued. “Our commitment to transforming communication is reaffirmed as we embark on a journey towards a future characterized by intelligent solutions.”

Phone.com is celebrating 15 years of consecutive year-over-year growth, driven by a strong clientele that includes more than 50,000 enterprises and an impressive increase in market share. Supported by an unwavering dedication to providing state-of-the-art services and technology at reasonable costs, the company’s approach works well for enterprises of all sizes, accelerating its trajectory of steady expansion.

Biosense Webster Unveils AI-Driven Heart Mapping Technology

Today, Biosense Webster, a division of Johnson & Johnson MedTech, announced the release of the most recent iteration of its Carto 3 cardiac mapping system.

Carto 3 Version 8 provides three-dimensional heart mapping for cardiac ablation procedures, and Biosense Webster integrates it with technologies such as its FDA-reviewed Varipulse pulsed field ablation (PFA) system.

Biosense Webster added two new software modules, Carto Elevate and CartoSound FAM, which the company designed to improve accuracy, efficiency, and repeatability in catheter ablation procedures for arrhythmias such as AFib.

Biosense Webster’s CartoSound FAM encompasses the first application of artificial intelligence in intracardiac ultrasound. In addition to saving time, the algorithm, according to the company, provides a highly accurate map by automatically generating the left atrial anatomy prior to the catheter being inserted into the left atrium. Through the use of deep learning technology, the module produces 3D shells automatically.

One of the Carto Elevate module’s new features is multipolar capability with the Optrell mapping catheter, which greatly reduces far-field potentials and produces a more precise activation map for localized unipolar signals. Elevate’s complex-signal identification flags crucial areas of interest efficiently and consistently, an improved Confidense module generates optimal maps, and pattern acquisition automatically monitors arrhythmia burden before and after ablation.

Jasmina Brooks, president of Biosense Webster, stated, “We are happy to announce this new version of our Carto 3 system, which reflects our continued focus on harnessing the latest science and technology to advance tools for electrophysiologists to treat cardiac arrhythmias.” For over a decade, the Carto 3 system has served as the mainstay of catheter ablation procedures, assisting electrophysiologists in their decision-making regarding patient care. With the use of ultrasound technology, better substrate characterization, and improved signal analysis, this new version improves the mapping and ablation experience of Carto 3.
