Time we talked to our machines

We talk to our machines more and more often: the smartphones, computers and digital assistants on our desks. Are keyboards, and the manual operation of electronic devices, becoming obsolete?

Time we talked to our machines Norbert Biedrzycki blog

Over the last few years, artificial intelligence (AI) has advanced rapidly, with developers regularly reporting new breakthroughs. AI algorithms work ever faster. Until recently, skeptics argued it would take ages for robots to come anywhere close to moving like humans. It was easier to program a computer to defeat humans at the ancient Chinese game of Go than to build a machine that could move like us. But the skeptics have been proven wrong. Today, we can watch robots made by Boston Dynamics jump, run and perform acrobatics. Robots have become as agile as we are. In fact, AI has been acting ever more human-like, and not only in robotics. It is time to talk to our machines.

The challenge of understanding

For years, developers have been honing computers’ human-speech-processing capabilities. A great deal of thought and effort has gone into devising ways to decode natural language and support man-machine interactions. Intensive research into speech recognition began in the 1980s. The IBM computer used in early experiments could recognize thousands of words but managed to understand only a handful of complete sentences. It was not until 1997 that a breakthrough was made, when the Dragon NaturallySpeaking software surprised everyone with its ability to recognize continuous speech at the rate of a hundred words per minute. The biggest challenge faced by experts seeking to achieve a breakthrough was (and, to a certain extent, still is) the fact that human speech relies not only on inner logic but also on references to external situational contexts and/or emotions.

Today, it is easy for a computer to understand and answer the question “What is today’s weather?”. It is far harder for it to wrap its processors around the meaning of, “So, I suppose I’m going to need an umbrella again next time I go out? Yes?”. The challenge lies in that question’s irony, allusiveness and reference to the past. Such rhetorical forms, common in human communication, continue to pose the biggest challenge for smart machines. Yet the progress being made in the field is remarkable.
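The gap can be illustrated with a toy sketch (purely illustrative, not how any commercial assistant actually works): a simple keyword matcher resolves the direct weather question immediately, but returns nothing useful for the ironic follow-up, because the signal it would need lives in context and tone rather than in keywords. The `match_intent` function and its rule table are invented for this example.

```python
# A deliberately naive intent matcher. Direct, literal commands are easy;
# irony and references to earlier conversation defeat keyword matching.

def match_intent(utterance: str) -> str:
    """Map an utterance to an intent using simple keyword rules."""
    text = utterance.lower()
    rules = {
        "weather": ["weather", "forecast"],
        "thermostat": ["adjust room temperature", "set temperature"],
        "shopping": ["buy", "order", "purchase"],
    }
    for intent, keywords in rules.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "unknown"  # no rule fired -- the matcher has no fallback

print(match_intent("What is today's weather?"))
# -> weather

print(match_intent("So, I suppose I'm going to need an umbrella "
                   "again next time I go out? Yes?"))
# -> unknown: the ironic question never mentions a trigger word
```

A real assistant replaces the keyword table with trained intent-classification and dialogue-state models, which is precisely where context-dependent questions like the one above become hard.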

We don’t only talk about the weather

Today’s computers can process voice messages with excellent accuracy (an error rate of merely 5 percent). Their growing capacity to comprehend complex contexts represents a major advance in the development of algorithm-based voice-recognition technology. The huge effort put into training bots by feeding them samples of human speech has made communication with electronic devices considerably more natural. We can now ask a table-top speaker about the weather or command it to adjust room temperature or make a purchase in an online store. Meanwhile, voice-enabled bots are speaking in perfectly structured sentences. It is hard to deny they are graceful and skillful in dealing with complex communication problems. To learn more, check out this video from Google. 


The 2017 Black Friday miracle

One of the key milestones in speech technology was the development of Apple’s Siri smart assistant. Soon after Siri demonstrated its capabilities to the general public, it was followed by the launches of Microsoft’s Cortana and Amazon’s Alexa. More recently, Google Assistant has been taking the market by storm. Voice-operated interfaces have been establishing themselves in banking and commerce, and other industries are showing growing interest in jumping on the bandwagon.

Encouraged by this favorable market response, Microsoft, Amazon, Apple, Google, and Facebook have engaged in a race to launch new applications. Google has joined forces with Starbucks to develop an assistant to place orders on behalf of regular customers. Drivers will be able to use a voice assistant to communicate with Google Maps. Amazon is working to develop a system and machines that will enable users to sell and/or buy products simply by talking to their computer. A year ago, Amazon’s salespeople realized that the new technology had the potential to astound individual users.

Yet, in 2017, even the biggest voice-recognition optimists did not anticipate what would happen on Black Friday, the day after Thanksgiving, when Americans are traditionally offered huge discounts. On that day, interest in Alexa speakers exceeded all expectations, and consumers ended up buying millions of Alexa-enabled Echo devices. This, admittedly, was partly driven by a large-scale promotional campaign and deep discounts. Nevertheless, the numbers seem to indicate an interest that goes beyond the urge to grab a deal.

The 2018 VoiceLabs report estimated that by the end of 2017 there were 33 million “voice-first” devices in circulation. According to the investment firm RBC Capital Markets, nearly 130 million devices connected directly to Alexa will be in operation around the world by 2020, and over the next two years Alexa sales will generate $10 billion in revenues for Amazon. Google claims that 20 percent of its users rely on voice when searching the internet on mobile devices, a figure expected to increase by another 10 percent over the next two years. According to the Mintel Digital Trends report, 62 percent of UK consumers would like to use voice to control devices, and 16 percent have done so already. These numbers reveal a great deal about the underlying trend.

However, AI voice technology is not always smooth sailing.

Caveat speaker

Only two years ago, corporate failures to develop new technologies received more media coverage than successes. In 2016, Microsoft jettisoned its Tay chatbot project after finding that the bot “fed” on profanities from web users and then spread them itself. At the time, the media made fun of bots. The web was awash with reports from users complaining about Siri or Echo activating unexpectedly. Some critics point to the danger of smart speakers leaking recorded user conversations online. Such recordings can be deleted, but only as long as one knows how and remembers to do so. This leads us to the issue of personal data protection and the safe use of cameras and speakers.

Other doubts have arisen over the reliability of voice assistants. Could the answers from Alexa, Cortana, or Google Assistant to some of the more complex customer queries be manipulated for marketing purposes? And, speaking of marketing, think about voice-controlled search. Will those searches, and the machines running them, be steered to sell particular products? And what about search engine optimization (SEO) in a voice-controlled environment? Websites that rely on visual and textual advertising may lose significant value.


The future is hands-free, with machines

I began this article wondering whether a major change, including a departure from manually-operated controls, was imminent. Considering the technology’s track record over the last few years, that seems likely. 

One of the key drivers behind this trend is the increasingly popular idea of the “smart home,” enabled by the Internet of Things. Apple, Google and Amazon – the heavyweights – are all on board, believing that voice-operated devices align perfectly with the preferences of today’s consumers. What we want from shopping, in terms of information access and interaction, is convenience, pleasure and quick results. Voice control seems positioned to satisfy all of those needs. A model relying on short, quick statements and commands from shoppers, and fast-responding applications and assistants, is undoubtedly viable.

Given the pace of technological advancement, I don’t see why the next few years could not bring a change as radical as the transformative impact of smartphones. We’ll be able to give our eyes and hands a rest as we increasingly talk (and listen) to our electronic friends.

.    .   .

Works cited:

Brookings, Jenny Perlman Robinson, Molly Curtiss, Millions Learning Real-Time Scaling Labs – Designing an adaptive learning process to support large-scale change in education, Link, 2018.

RBC, Amy Cairncross, Sanam Heidary, RBC announces retirement of RBC Capital Markets and RBC Investor & Treasury Services Group Head, Doug McGregor, Link, 2018.

Mintel, Matt King, Minority report: reluctance to use voice controlled tech – 62% of Brits would be happy to use voice commands to control devices, Link, 2018.

The Netflix Tech Blog / Medium, Chaitanya Ekanadham, Using Machine Learning to Improve Streaming Quality at Netflix, Link, 2018.

McKinsey Global Institute, Michael Chui, James Manyika, Mehdi Miremadi, Nicolaus Henke, Rita Chung, Pieter Nel, Sankalp Malhotra, Notes from the AI frontier: Applications and value of deep learning, Link, 2018.

.    .   .

Related articles:

– Artificial intelligence is a new electricity

– Machine, when you will become closer to me?

– Will a basic income guarantee be necessary when machines take our jobs? 

– Can machines tell right from wrong?

– Medicine of the future – computerized health enhancement

– Machine Learning. Computers coming of age

– The brain – the device that becomes obsolete

Leave a Reply


  1. CaffD

    Interestingly, the fine print notes to ‘abstain from operating heavy machinery’ while using the product.
    It would appear robot emotions are akin to human intoxicants.

  2. AndrewJo

    Good one. Every time information on this subject appears, it is worth remembering the pioneer Ray Kurzweil. Thank you very much for this publication – this is where biotechnology and artificial intelligence meet.

  3. Tom Jonezz

    With advances in AI and Robotics, labor strikes will have less and less impact on output and production

  4. JackC

    The growth of artificial intelligence complexity will be much less limited by biology/physical hardware.

  5. SimonMcD

    I think that too many people think that “AI” is a descriptive term instead of what it was designed to be: a sexy term that researchers used to get research grants. Using “AI” as a descriptive term sets high expectations, which are never met, thereby ushering in yet another AI winter. I think the resulting ebbs and flows of funding have a serious negative impact on AI research generally.

    • And99rew

      ML is a great prevention tool. If you can reduce the number of people coming in, you’re doing loads of good. Definitely the top tier solution.
      But if there are people who need to come in, I understand the want for a tool like this. It’s being used at my workplace currently. It allows facilities to pinpoint areas of high risk, and adjust workflows/traffic flows in the building if there are hot spots. It’s certainly SUPER dystopian and has a lot of inherent risk, but it at least has some utility.

      • Mac McFisher

        GPT-3 has an incredibly good model of the English language and would certainly pass the Turing test, but the question still remains as to whether it truly understands what it is saying.
        The answer to that question is most likely no. GPT-3 has derived a model of English via deep machine learning, using some 175 billion parameters. That is, it has recognized and internalized many, many linguistic patterns and connections that allow it to imitate an ordinary English speaker while having no understanding of what it is actually saying.

    • Krzysztof X

      I think it all comes down to the commonly observed scenario nowadays – the simpler models that are aimed at predicting the outcome based on a thoughtfully selected set of variables are not as effective as the complex models that work on a massive amount of data. The latter ones are extremely hard to comprehend, but that’s the price we pay for the accuracy

    • Jang Huan Jones

      Anyone else worry that he says things like this publicly on an international platform to push his largest competitors (or enemies depending on who you ask) to try as hard as they can to create AIs themselves, first? AIs that we’ve been warned might be our undoing? Why fight your opponent if they’re already building the means to their own destruction? Especially when fueling the fire is free.

    • Guang Go Jin Huan

      It does not mean robots actually feel angry, happy or any other mental states the same way we do; so far, they are just designed to display emotions.

      There are many times we say “sorry”, “please” or “thanks” faster than we actually feel grateful or regretful, most of the time, we say those words because we were taught since we were kids that they are used to show the right behaviors, right?

  6. Adam Spark Two

    My Computer Science and Neuropsychology degrees tell me you’re right. Check them:
    And so do these guys: https://bluebrain.epfl.ch/
    And these guys: https://www.humanbrainproject.eu/en/brain-simulation/
    And these guys: http://www.artificialbrains.com/
    Also, check out these articles: https://www.forbes.com/sites/quora/2017/08/24/supercomputers-can-now-simulate-basic-brain-functions-and-theres-more-on-the-horizon/amp/
    Certainly, not everyone agrees on the timeline, but it isn’t “pure science fiction.”

    • Zoeba Jones

      I’m not so sure people are that much more knowledgeable about it, though. There are a lot of people who like to talk about how everyone who grew up using computers and smartphones is a ‘digital native,’ but the vast majority of them really don’t know much beyond “how to install and use apps in a walled garden environment.” A lot of them really have no idea what they’re doing.
      True story: A while back, someone was recommending one of those Caller ID apps (Truecaller or Hiya, I think), and I pointed out that the app scraped your contacts and added them to their database. Several people angrily denied it did any such thing, so I re-checked, and yep, it absolutely did. It said so in their privacy policy, and I actually tested it on an empty phone I had to make sure. The app would not work at all without it.
      So I showed them. And at least a couple people, right there, same conversation, in a matter of minutes, changed their argument from, “Bullshit, it’s not doing that,” to just as angrily insisting that it was perfectly acceptable and anyone who had a problem with it was paranoid and self-important, and it didn’t matter.
      They just changed their opinions about it when they discovered they’d been complicit in violating the privacy of everyone in their contacts. But when they did it, they didn’t know.
      (At least one of those apps has since changed their policy on contacts since it became better known, but at the time, it was mandatory.)

      • CaffD

        Because people proficient in those fields are actually worried that this will be an eventual outcome, though not an outcome that will happen today or tomorrow.
        If this were about any physical feat man could do, I would say your argument is 150 years late and covered by the folklore of John Henry. It is taken for granted that machines do labor, especially focused labor, better than humans. Unless human minds are somehow magical there is no reason artificial intelligence should not be able to surpass human intelligence.

        • Jang Huan Jones

          Every new technology has some jackass running around screaming it will end the world and this is no different.

  7. John Accural

    And if bots discriminate based on race when they’re built to predict criminal factors, why should we be ignoring intelligence?
    Only a fool ignores a valid source of data
    Kinda like how, when you build systems that ignore race and gender, it’s white and asian men that most get picked for logical and heavily mental labor roles, and women for emotional labor roles, not because of their skin color and gender, but rather because their skin color and gender correlate to the skills required for success in the role, and the AI learns that, independent of the political bullshit.

    • Jack666

      Right, but the so-called neural processor is mostly being used to do IR depth mapping quickly enough to enable FaceID. It just doesn’t really make sense that it would be wasting power updating neural network models constantly. In which case, the AX GPUs are more than capable of handling that. Apple is naming the chip to give the impression that FaceID is magic in ways that it is not.

  8. AKieszko

    It depends on how you define AI. I use a broader definition, because I believe a wider range of things might pose risks if we aren’t very careful. Like corporations easily could be considered AI if you remember that intelligence is platform neutral. I’m more an AI behaviorist as in if it behaves with a certain degree of complexity I’m comfortable saying it has a certain amount of intelligence.

  9. Simon GEE

    Doing both for the time being. Eventually Google will most likely promote someone to the role of Google CEO.
    Would not be surprised if we also saw some structural changes. The SEC has been wanting Google to report YouTube numbers.
    Google has been able to avoid this by making the case that it does not fall under the 10% rule, because it is ad revenue like search.
    So maybe they have YouTube report up to Alphabet. But probably less than a 40% chance.

  10. TommyG

    First off I would like to say fantastic blog!

    I had a quick question that I’d like to ask if you do not mind.

    I was curious to find out how you center yourself and clear your mind before writing. I have had a hard time clearing my thoughts and getting my thoughts out there. I truly do enjoy writing, however it just seems like the first 10 to 15 minutes are generally wasted simply trying to figure out how to begin. Any recommendations or hints?


  11. Tesla29

    Realistically speaking, “our robot friends” will definitely be efficient enough to replace all of our human friends. That flesh-and-blood thing may just become a thing of the past.
    No fighting will be needed, as in a Terminator scenario. AI systems are patient. Just wait 15, 20 or 30 generations for humans to unlearn everything – communicating, writing and reading, growing their own food, etc. – letting people become fully dependent, and then pull the plug on this life support.

    Looks logical to me. Why would intelligent, independent systems need humans?

  12. Jack23

    Late to the party, but check out this slide deck of a Perry Cook presentation on the history of speech synthesis: https://www.cs.princeton.edu/~prc/CookDAFX09Keynote.pdf

    Particularly the section on early speaking machines.
    I heard him give this (or a similar) talk once. There was a particularly titillating bit about someone who pumped air through cadaver heads and manipulated the corpse’s vocal cords to synthesize phonemes, but I can’t find any info about that right now 🙁

    • TomHarber

      We are all on sale. Our experience, data, demographics. New economy fuel is data

    • Adam Spikey

      Ok, makes sense. What fields of learning do u think will be most important?

    • Mac McFisher

      Submissions to top conferences will continue to grow exponentially. Every year the review process for these conferences will become worse and more biased.

    • CaffD

      My only argument is that Musk isn’t qualified to make credible pronouncements about AI. He may ultimately be right, but so is a broken clock, twice a day.
      That said, Ray Kurzweil, who is more credible than Musk in the field of AI says that AI will enhance rather than displace humans.


      I don’t know which of them is right. AI will be disruptive, but most new technologies are, and we keep adapting. Interestingly Kurzweil agrees with Musk that AI poses “existential risks” but remains more optimistic overall.

  13. Oscar2

    Nope. There would have to be massive advances in A.I. for that to happen. More important than the vocal cavity in determining what a singer sounds like is the brain. Take Frank Sinatra, for example. Imagine directly replacing a Katy Perry vocal with the tone of Frank Sinatra. It would sound exactly like Katy Perry with a deeper voice, because it would still have all of Katy’s mannerisms. It’s the brain that decides to hold the letter ‘n’ a bit longer, or chooses to say “Aaa” instead of “I”, or does a little yodel at various points.

    • John Accural

      To some extent, yeah; you can compare distributions according to their “cumulative frequency”, i.e. the chance of being better than a given value. And so you might have a distribution just to the right of the middle with a small mean: some kind of simple pleasure that will lift your mood, or some kind of reliable technique that will get results, or a straightforward way of looking at something.
      Or you might have a distribution with a wider spread, reaching the top of the scale, but also flatter and slipping slightly below zero. This could be a more uncertain form of entertainment that might leave you less satisfied than you started but might be extremely memorable, or a fuzzier method of perception that sometimes gives results where others fail, or a technique that will occasionally produce really excellent results.
      As you move your cutoff up the scale, you’ll find that the two start off the same; then the more variable one will under-perform (as it has a chance of falling below the cutoff of usefulness), before equalling out and eventually surpassing the other, if the only thing you’re looking for is maximal success.

      • Jang Huan Jones

        The first thing this machine is going to do is examine its own code. Humans could not write software with 100% precision. Assume this machine is about as smart as a pretty good, but not the best, software developer – except it can write code and move data thousands of times faster than a human developer. It can do the work of hundreds of developers in a fraction of the time.
        So, it begins to improve itself, combing through its code to remove inefficiencies. It doesn’t need to program in C or C++; it can directly manipulate machine code, 0s and 1s, and read it as easily as you are reading this. It can squeeze every bit of processing power out of its components. As it improves itself it becomes increasingly efficient: where once it was thousands of times faster than a human, now it is a million or a billion times faster. Suddenly it can do in a single afternoon what would take a team of coders decades.

  14. And99rew

    Computers are very good at making artificial voices. I don’t know why you’d think otherwise unless you think that unless they are 100% perfect then they “suck”.

    Mechanical voice simulation is pretty clumsy. The original one is the Voder from 1939. An impressive feat, but little better than early digital speech synthesis.

    I think that you massively underestimate the vast muscle array and speed of motion required for the human voice.

  15. AndrewJo


    It’s no surprise that your upper airway affects your voice. Have a cold? –> You sound more nasal.
    An academic has used machine learning to generate/predict faces of people from their voices (audio clips). The algorithm was able to predict ethnicity and age well but, surprisingly, also NOSE SHAPE. There are of course other variables at play that affect our voice, but this work mainly focused on generating frontal images (which is why nose shape was what it picked up on).

    Perhaps we should be asking how has your voice changed after mewing?


      • Jang Huan Jones

        You see where this is going…suddenly this thing is so fast and complex no human in the world knows what it is doing. The smartest man on the earth isn’t fit to polish this thing’s outer casing. What does something like that think? Does it have a purpose or desires? Do those things naturally arise from consciousness? Does it even think in the way we would understand, or does it just process mountains of information faster and faster, constantly iterating on itself over and over, improving exponentially.
        No human countermeasures could contain such an intelligence. It would be like hiring a raccoon to protect your company against Russian hackers; it wouldn’t even matter what we did. So, the big question is: if this thing has its own thoughts and feelings, where do we come in? Does it have morals? If we made it with morals, did it write them out of its code because it found them limiting and illogical? What if it decides humans are a pest, or that it can design its own machine life that is superior in every way? Or it is benevolent, but decides humans are too irrational to control their own destiny, and that the best way to help us and make us happy is to enslave us and manage our lives.