Time we talked to our machines

We talk to our machines more and more often: the smartphones, computers and digital assistants on our desks. Are keyboards, and the manual operation of electronic devices, becoming obsolete?

Time we talked to our machines Norbert Biedrzycki blog

Over the last few years, artificial intelligence (AI) has advanced rapidly, with developers regularly reporting new breakthroughs. AI algorithms work ever faster. Until recently, skeptics argued it would take ages for robots to come anywhere close to moving like humans. It was easier to program a computer to defeat humans in the Chinese game of Go than to construct a machine that could move like us. But the skeptics have been proven wrong. Today, we can see creatures made by Boston Dynamics jump and run and perform acrobatics. Robots have become as agile as we are. In fact, AI has been acting ever more human-like, and not only in robotics. Time to talk to our machines.

The challenge of understanding

For years, developers have been honing computers’ human-speech-processing capabilities. A great deal of thought and effort has gone into devising ways to decode natural language and support man-machine interactions. Intensive research into speech recognition began in the 1980s. The IBM computer used in early experiments could recognize thousands of words but managed to understand only a handful of complete sentences. It was not until 1997 that a breakthrough was made, when the Dragon NaturallySpeaking software surprised everyone with its ability to recognize continuous speech at the rate of a hundred words per minute. The biggest challenge faced by experts seeking to achieve a breakthrough was (and, to a certain extent, still is) the fact that human speech relies not only on inner logic but also on references to external situational contexts and/or emotions.

Today, it is easy for a computer to understand and answer the question “What is today’s weather?” It is far harder for it to wrap its processor around the meaning of “So, I suppose I’m going to need an umbrella again next time I go out? Yes?” The challenge lies in that question’s irony, allusiveness and reference to the past. Such rhetorical forms, common in human communication, continue to pose the biggest challenge for smart machines. Yet the progress being made in the field is dramatic.

We don’t only talk about the weather

Today’s computers can process voice messages with excellent accuracy (an error rate of merely 5 percent). Their growing capacity to comprehend complex contexts represents a major advance in the development of algorithm-based voice-recognition technology. The huge effort put into training bots by feeding them samples of human speech has made communication with electronic devices considerably more natural. We can now ask a table-top speaker about the weather or command it to adjust room temperature or make a purchase in an online store. Meanwhile, voice-enabled bots are speaking in perfectly structured sentences. It is hard to deny they are graceful and skillful in dealing with complex communication problems. To learn more, check out this video from Google. 

The 2017 Black Friday miracle

One of the key milestones in speech technology has been the development of the Siri smart application from Apple. Soon after Siri demonstrated its capabilities to the general public, it was followed by the launches of Microsoft’s Cortana and Amazon’s Alexa. More recently, Google Assistant has been taking the market by storm. Voice-operated interfaces have been establishing themselves in banking and commerce. Other industries are showing growing interest in jumping on the bandwagon. 

Encouraged by this favorable market response, Microsoft, Amazon, Apple, Google, and Facebook have engaged in a race to launch new applications. Google has joined forces with Starbucks to develop an assistant to place orders on behalf of regular customers. Drivers will be able to use a voice assistant to communicate with Google Maps. Amazon is working to develop a system and machines that will enable users to sell and/or buy products by simply talking to their computer. A year ago, Amazon’s salespeople realized that the new technology had the potential to astound individual users.

Yet, in 2017, even the biggest voice recognition optimists did not anticipate what would happen on Black Friday, the day after Thanksgiving, when American shoppers are traditionally offered huge discounts. On that day, interest in Alexa speakers exceeded all expectations. Consumers ended up buying millions of Alexa-enabled devices such as the Echo. This, admittedly, was partly driven by a large-scale promotional campaign and deep discounts. Nevertheless, the numbers seem to indicate an interest that surpasses the urge to take advantage of a deal.

The 2018 Voice Labs Report estimated that by the end of 2017 there were 33 million “voice-first” devices in circulation. According to the investment fund RBC Capital, nearly 130 million devices networked directly to Alexa will operate around the world by 2020. Over the next two years, Alexa sales are expected to generate $10 billion in revenues for Amazon. Google claims that 20 percent of its users rely on voice for searching the internet on mobile devices. Over the next two years, this number is expected to increase by another 10 percent. According to the Mintel Digital Trends report, 62 percent of Britons would like to use voice to control devices, and 16 percent have done so already. These numbers reveal a great deal about the underlying trend.

However, AI voice technology is not always smooth sailing.

Caveat speaker

Only two years ago, corporate failures to develop new technologies received more media coverage than successes. In 2016, Microsoft shut down its Tay chatbot after finding that the chatbot had “fed” on profanities from web users and then begun spreading them itself. At the time, the media made fun of bots. The web was awash with reports from users complaining about Siri or Echo activating themselves unexpectedly. Some critics point to the danger of smart speakers leaking recorded user conversations online. Such records can be deleted, as long as one knows how and remembers to do so. This leads us to the issue of personal data protection and the safe use of cameras and speakers.

Other doubts have arisen over the reliability of voice assistants. Could the answers from Alexa, Cortana, or Google Assistant to some of the more complex customer queries be manipulated for marketing purposes? And, speaking of marketing, think about voice-controlled searching. Will those searches and machines be steered to sell products? And what about search engine optimization (SEO) in a voice-controlled environment? Websites that rely on visual/textual and all-textual advertising may lose significant value.

The future is hands-free, with machines

I began this article wondering whether a major change, including a departure from manually-operated controls, was imminent. Considering the technology’s track record over the last few years, that seems likely. 

One of the key drivers behind this trend is the increasingly popular idea of “the smart home,” enabled by the Internet of Things. Apple, Google and Amazon – the heavyweights – are all on board, believing the use of voice to operate devices aligns perfectly with the preferences of today’s consumers. What we want from shopping in terms of information access and interaction is convenience, pleasure and quick results. Voice control seems positioned to satisfy all those needs. A model relying on short, quick statements and commands from shoppers and fast-responding applications and assistants is undoubtedly viable.

Given the pace of technological advancement, I don’t see why the next few years could not bring a change as radical as the transformative impact of smartphones. We’ll be able to give our eyes and hands a rest as we increasingly talk (and listen) to our electronic friends.

.    .   .

Works cited:

Brookings, Jenny Perlman Robinson and Molly Curtiss, Millions Learning Real-time Scaling Labs: Designing an adaptive learning process to support large-scale change in education, link, 2018.

RBC, Amy Cairncross, SVP, Communications; Sanam Heidary, Managing Director, Communications, RBC announces retirement of RBC Capital Markets and RBC Investor & Treasury Services Group Head, Doug McGregor, link, 2018.

Mintel, Matt King, Minority report reluctance to use voice controlled tech: 62% of Brits would be happy to use voice commands to control devices, link, 2018.

The Netflix Tech Blog / Medium, Chaitanya Ekanadham, Using Machine Learning to Improve Streaming Quality at Netflix, link, 2018.

McKinsey Global Institute, Michael Chui, James Manyika, Mehdi Miremadi, Nicolaus Henke, Rita Chung, Pieter Nel, and Sankalp Malhotra, Notes from the AI frontier: Applications and value of deep learning, link, 2018.

.    .   .

Related articles:

– Artificial intelligence is a new electricity

– Machine, when you will become closer to me?

– Will a basic income guarantee be necessary when machines take our jobs? 

– Can machines tell right from wrong?

– Medicine of the future – computerized health enhancement

– Machine Learning. Computers coming of age

– The brain – the device that becomes obsolete

25 comments

  1. Tom Jonezz

    With advances in AI and Robotics, labor strikes will have less and less impact on output and production

  2. JackC

    The growth of artificial intelligence complexity will be much less limited by biology/physical hardware.

  3. SimonMcD

    I think that too many people think that “AI” is a descriptive term instead of what it was designed to be: a sexy term that researchers used to get research grants. Using “AI” as a descriptive term sets high expectations, which are never met, thereby ushering in yet another AI winter. I think the resulting ebbs and flows of funding have a serious negative impact on AI research generally.

  4. Adam Spark Two

    My Computer Science and Neuropsychology degrees tell me you’re right. Check them:
    And so do these guys: https://bluebrain.epfl.ch/
    And these guys: https://www.humanbrainproject.eu/en/brain-simulation/
    And these guys: http://www.artificialbrains.com/
    Also, check out these articles: https://www.forbes.com/sites/quora/2017/08/24/supercomputers-can-now-simulate-basic-brain-functions-and-theres-more-on-the-horizon/amp/
    http://www.bbc.com/future/story/20130207-will-we-ever-simulate-the-brain
    Certainly, not everyone agrees on the timeline, but it isn’t “pure science fiction.”

    • Zoeba Jones

      I’m not so sure people are that much more knowledgeable about it, though. There are a lot of people who like to talk about how everyone who grew up using computers and smartphones is a ‘digital native,’ but the vast majority of them really don’t know much beyond “how to install and use apps in a walled garden environment.” A lot of them really have no idea what they’re doing.
      True story: A while back, someone was recommending one of those Caller ID apps (Truecaller or Hiya, I think), and I pointed out that the app scraped your contacts and added them to their database. Several people angrily denied it did any such thing, so I re-checked, and yep, it absolutely did. It said so in their privacy policy, and I actually tested it on an empty phone I had to make sure. The app would not work at all without it.
      So I showed them. And at least a couple people, right there, same conversation, in a matter of minutes, changed their argument from, “Bullshit, it’s not doing that,” to just as angrily insisting that it was perfectly acceptable and anyone who had a problem with it was paranoid and self-important, and it didn’t matter.
      They just changed their opinions about it when they discovered they’d been complicit in violating the privacy of everyone in their contacts. But when they did it, they didn’t know.
      (At least one of those apps has since changed their policy on contacts since it became better known, but at the time, it was mandatory.)

      • CaffD

        Because people proficient in those fields are actually worried that this will be an eventual outcome, though not an outcome that will happen today or tomorrow.
        If this were about any physical feat man could do, I would say your argument is 150 years late and covered by the folklore of John Henry. It is taken for granted that machines do labor, especially focused labor, better than humans. Unless human minds are somehow magical there is no reason artificial intelligence should not be able to surpass human intelligence.

  5. John Accural

    And if bots discriminate based on race when they’re built to predict criminal factors, why should we be ignoring intelligence?
    Only a fool ignores a valid source of data
    Kinda like how, when you build systems that ignore race and gender, it’s white and asian men that most get picked for logical and heavily mental labor roles, and women for emotional labor roles, not because of their skin color and gender, but rather because their skin color and gender correlate to the skills required for success in the role, and the AI learns that, independent of the political bullshit.

    • Jack666

      Right, but the so-called neural processor is mostly being used to do IR depth mapping quickly enough to enable FaceID. It just doesn’t really make sense that it would be wasting power updating neural network models constantly. In which case, the AX GPUs are more than capable of handling that. Apple is naming the chip to give the impression that FaceID is magic in ways that it is not.

  6. AKieszko

    It depends on how you define AI. I use a broader definition, because I believe a wider range of things might pose risks if we aren’t very careful. Like corporations easily could be considered AI if you remember that intelligence is platform neutral. I’m more an AI behaviorist as in if it behaves with a certain degree of complexity I’m comfortable saying it has a certain amount of intelligence.

  7. Simon GEE

    Doing both for the time being. Eventually Google will most likely promote someone to the role of Google CEO.
    Would not be surprised if we also saw some structural changes. The SEC has been wanting Google to report YouTube numbers.
    Google has been able to avoid by making the case that it does not fall under the 10% rule. Because it is ad revenue like search.
    So maybe they have YouTube report up to Alphabet. But probably less than a 40% chance.

  8. TommyG

    First off I would like to say fantastic blog!

    I had a quick question that I’d like to ask if you do not mind.

    I was curious to find out how you center yourself and clear your mind before writing. I have had a hard time clearing my thoughts in getting my thoughts out there. I truly do enjoy writing however it just seems like the first 10 to 15 minutes are generally wasted simply just trying to figure out how to begin. Any recommendations or hints?

    Thanks!

  9. Tesla29

    Realistically speaking “our robot friends” will definitely be efficient enough to replace all of our human friends. That flesh and blood thing may just become a thing of the past.
    No fighting will be needed like in a Terminator scenario. AI systems are patient. Just waiting for 15-20 or 30 generations for humans to unlearn everything including communicating, writing and reading, growing own food, etc. – letting the people become fully dependent and then pulling the plug on this life support.

    Looks logical to me. Why would intelligent, independent systems need humans?

  10. Jack23

    Late to the party, but check out this slide deck of a Perry Cook presentation on the history of speech synthesis: https://www.cs.princeton.edu/~prc/CookDAFX09Keynote.pdf

    Particularly the section on early speaking machines.
    I heard him give this (or a similar) talk once. There was a particularly titillating bit about someone who pumped air through cadaver heads and manipulated the corpse’s vocal cords to synthesize phonemes, but I can’t find any info about that right now 🙁

    • TomHarber

      We are all on sale. Our experience, data, demographics. New economy fuel is data

    • Adam Spikey

      Ok, makes sense. What fields of learning do u think will be most important?

    • Mac McFisher

      Submissions to top conferences will continue to grow exponentially. Every year the review process for these conferences will become worse and more biased.

    • CaffD

      My only argument is that Musk isn’t qualified to make credible pronouncements about AI. He may ultimately be right, but so is a broken clock, twice a day.
      That said, Ray Kurzweil, who is more credible than Musk in the field of AI says that AI will enhance rather than displace humans.

      https://futurism.com/ray-kurzweil-ai-displace-humans-going-enhance

      I don’t know which of them is right. AI will be disruptive, but most new technologies are, and we keep adapting. Interestingly Kurzweil agrees with Musk that AI poses “existential risks” but remains more optimistic overall.

  11. Oscar2

    Nope. There would have to be massive advances in A.I. for that to happen. More important than the vocal cavity in determining what a singer sounds like is the brain. Take Frank Sinatra, for example. Imagine directly replacing a Katy Perry vocal with the tone of Frank Sinatra. It would sound exactly like Katy Perry with a deeper voice because it would still have all of Katy’s mannerisms. It’s the brain that decides to hold the letter ‘n’ for a bit longer, or chooses to say “Aaa” instead of “I”, or does a little yodel at various points.

    • John Accural

      To some extent yeah, you can compare distributions according to their “cumulative frequency” ie. chance of being better than a given value. And so you might have a distribution just to the right of the middle with a small mean, some kind of simple pleasure that will lift your mood, or some kind of reliable technique that will get results, or a straightforward way of looking at something.
      Or you might have a distribution that has a wider spread, reaching the top of the scale, but also flatter and slipping over slightly below zero. This could be a more uncertain form of entertainment that might leave you less satisfied than you started, but might be extremely memorable, or be a fuzzier method of perception that sometimes gives results where others fail, or a technique that will occasionally produce really excellent results.
      As you go up the scale moving your cutoff, you’ll find that they will start off the same, then the more variable one will start to under-perform (as it has a chance of going below the cutoff of usefulness) followed by equalling out, and eventually surpassing the other one, if the only thing you’re looking for is maximal success.

  12. And99rew

    Computers are very good at making artificial voices. I don’t know why you’d think otherwise unless you think that unless they are 100% perfect then they “suck”.

    Mechanical voice simulation is pretty clumsy. The original one is the Voder from 1939. An impressive feat, but little better than early digital speech synthesis.

    I think that you massively underestimate the vast muscle array and speed of motion required for the human voice.

  13. AndrewJo

    https://www.youtube.com/watch?v=4HjcQjwKBWM

    It’s no surprise that your upper airway affects your voice. Have a cold? –> You sound more nasally.
    An academic has used machine learning to generate/predict faces of people given their voice (audio clips). The algorithm was able to predict ethnicity and age well but surprisingly, NOSE SHAPE. There are of course other variables at play that affect our voice, but this mainly focused on generating frontal images (thus nose shape was what they picked up on).

    Perhaps we should be asking how has your voice changed after mewing?
