Sorcero Employee Spotlight: Dr. Walid Saba

Dr. Walid Saba is a Natural Language Understanding (NLU) thought leader.

Walid is known for his distinct voice on AI and Natural Language Understanding (NLU). His Medium page has led to invitations to contribute to major online tech magazines, and his piece titled Machine Learning Won't Solve Natural Language Understanding was a runner-up in the awards for most influential and viewed articles in The Gradient.

Prior to joining Sorcero, Walid was a Principal AI Scientist at Astound and the CTO of Klangoo, the developer of the Magnet digital content semantic engine. He has held various positions at places such as the American Institutes for Research, AT&T Bell Labs, IBM Watson, and Cognos.

He has also spent 7 years in academia where he taught computer science at Carleton University, the New Jersey Institute of Technology (NJIT), the University of Windsor, and the American University of Beirut (AUB). He has published over 50 technical articles, including an award-winning paper that he presented at the German Artificial Intelligence Conference (KI-2008), and he holds a Ph.D. in Computer Science from Carleton University.

We recently had the opportunity to interview Senior Scientist Dr. Walid Saba about his Natural Language Understanding (NLU) expertise and experience working in AI at Sorcero. Here is what he had to say.

Christina Cullen (CC): You have an extremely impressive background, and we’re very lucky to have you here at Sorcero. In your own words, how would you describe the focus or highlights of your career?

Dr. Walid Saba (WS): Well, I've been in AI / Natural Language Understanding for a long time. I'm an old cat. I even lived through the first AI Winter, which very few of today's young people know about. So I've been in AI for quite some time. For many years, we couldn't do any AI. AI was not a good thing to speak of. Like I said, we went through an AI Winter. So I made a living as a software engineer. After all, I'm a computer scientist, but I never really stopped doing research in AI. It's a passion of mine. So my career was as a software engineer, but I've always been passionate about AI. My grad studies were all in AI.

CC: What originally inspired you to study artificial intelligence?

WS: The term itself attracted me to artificial intelligence. What is that all about? So I was in my first year at the university, I went to engineering first. Then I read an article in some science magazine, probably Scientific American or something, and it was all about artificial intelligence. And I said, What the hell is this? That's interesting. And it talks about the mind and cognition and reasoning and language and logic, and it combines psychology - it combines so many disciplines. You have to get into philosophy and language and linguistics and neuroscience. And I said - that's the one, and I switched to computer science. I've been infected by that bug ever since.

CC: That's amazing. It's really interesting. For those who aren't as familiar, how would you explain natural language understanding?

WS: I've written a lot about that: the difference between natural language understanding and natural language processing. Another term we use is text processing. Processing is when you have text and you can get something out of it, like - what are the key topics in this article? Basically, what is it all about? How can I summarize it? Can I cluster these documents together because they're somewhat related? All of that is language processing or text processing. If you want language understanding, it’s another beast. The key word here is understanding.

We use language to express thoughts, right? So the external thought is in language, but internally, we have a mental model of thought. I don't want to get into it because there's no agreement on how it is inside before we express it in language. Understanding is really, ‘I am expressing some thought I encoded in English or Chinese, and I send it to you either written or by voice and you receive that, and you decode that, and hopefully, you get at the thought that I was trying to convey.’

That's the understanding part. So it's a lot more than just text processing, and it's an amazing thing. I mean, you meet a four-year-old girl in a zoo, and you’ve never met before. You can chat with her for like an hour, exchanging thoughts and ideas, and it involves doing so many things: resolving references, disambiguating words that have many meanings. Take “John had pizza with his kids.” It means together with his kids. Now take “John had pizza with pineapple.” The kids are not a topping and the pineapple is not a companion. Now, pineapple is a topping. It's amazing, if you look at the details, what goes into understanding the thought we convey. So long story short, NLU is really about comprehending the thought you are trying to convey in English or Danish or Chinese or whatever. It's a very, very challenging problem.

CC: I think it helps when you put it that way.

WS: Yeah, and notwithstanding all the hype, and there's a lot of hype now, we don't have anything called language understanding yet. We made a lot of progress on language processing, looking at it as data, crunching all that text, and getting something out of it. But human-level language understanding by machines? There is no machine that can understand human linguistic communication as of now. Not even close.

CC: You've explained how this has come to be over periods of time, and I think there's a common perception that artificial intelligence and machine learning is all very new. How do you view that?

WS: Oh, no, not new at all. I mean, like I was saying in the beginning, I am an old cat. And I've always been in AI. And it's even older than me by two, three decades. AI started in the 50s, even officially as a discipline. The term was coined in ’56 by John McCarthy, an AI pioneer, who I admire a lot. But even in 1950, computing pioneer Alan Turing (people outside of computer science probably know him from the Enigma movie) wrote a paper. It's a classic, called Computing Machinery and Intelligence, and that was the introduction of AI. But even before computers, AI has always been there, like how do we explain the mind? How do we reason and how do we communicate? So AI is an old discipline. But even formally, as a discipline in computer science, it goes back to the 50s. It's not new at all, even neural networks, which we now speak of as machine learning and deep learning.

They were at their peak in the mid-80s. And what we now call deep learning hasn't changed much from the networks of the 80s. All that has changed drastically is that we have a lot more computing power. We couldn't run all this stuff on massive data back in the days when we had little memory and computing power was nothing compared to what we have now. So all this stuff, there's nothing new under the sun. It's pretty old.

CC: So for you, as a major thought leader in the field of artificial intelligence and natural language understanding, what made you decide to join Sorcero?

WS: I had run a startup for two and a half years. I couldn't get support for it and I was getting tired of it because it takes you away from doing real science. Then, you have to deal with all the business stuff. So I got tired of it. I said I'm going to go back and find a good job with a company that's visionary and ambitious, one that wants to do cool stuff, and I looked around and Sorcero came up. I kept reading about what Sorcero was doing, and I was like, wow, these guys are ambitious. They're doing cool stuff. But what did it for me was the chat I had with a few people.

Especially Walter, the CTO. That did it for me. I was like, I can, and I know I will, do great stuff, and I will enjoy it. There's a good vision. That's good, and it's ambitious. That's important. We want to push the state of the art. I thought that was good, and it clicked personally. That's important, too.

CC: Well, we're very glad to hear that, and that's great. We definitely feel the same. So, you're the Senior Scientist at Sorcero. Could you tell us more about your position?

WS: From what I understood, actually from Walter, is that this position was introduced for me when they decided that they’d also like to have me on board. So it's a new position that was introduced for me. I understood from him that what he would like me to do is look at everything we're doing, and see where I can help push things further. There are a lot of opportunities. It's all over the place. I mean, we're dealing with massive amounts of text. We have a need to do clustering, categorization, extracting key topics, automatic tagging, summarization, semantic search, question answering. It's all over the place. And that's the fun part.

So I'm starting to nibble at these one at a time. But there's a lot, and the nice thing about it? There's no ceiling. I mean, you can go all the way to language understanding at some point and do real question answering, engaging in a dialogue, and so that's the fun part. There's no limit to how far we can go. There's a lot for me to do, which is fun.

CC: That's great. What do you hope to see come from the work you're doing here, or the work we're all doing here?

WS: Actually, I'm new to this domain, the health industry and related fields like pharmaceuticals, medical science, all that, and the opportunity I see is amazing. I mean, if you can help these professionals, whose time is very precious. These are people that can use the time to do other things than just extraction and summarization, or going through 100 articles to find the one thing they were looking for. So there's real value in applying language intelligence to help them sift through all that content.

If you save time for these guys, the value is there because then you're helping them achieve things better. We're talking about the health industry. Can you imagine how much time you can save them? What's the value, to not only the tech industry, but to society, if you can help medical professionals save time, and focus on the important things? The opportunity is huge.

CC: Absolutely. Related to that, as well as what you've said before about how much data we have, I wanted to bring up something that you've said in a previous interview. “When you look at language as data, you're missing a lot.” Can you explain what this means and how it applies to what we're doing here at Sorcero?

WS: It might not apply immediately. But the first part is, I mean language, we look at a huge corpus of text. I mean, you can get the Library of Congress, all the content that has been written. That's data, right? But not everything that we can understand is in that data.

When we communicate, we leave a lot of things out.

So for example, I say, "Mary enjoyed the movie." I didn't say watching is what she enjoyed, but obviously what she enjoyed is watching. People can direct and produce and sell and buy movies, but you know I meant Mary enjoyed watching it. Or "Mary enjoyed the book," meaning Mary enjoyed reading the book. This is just a small example of what I call in my writing “the missing text phenomenon.” The text that we read or we hear is just a clue to the thought I'm trying to convey. We leave a lot of stuff out because we all have common sense knowledge. We all know what we all know, so the data itself doesn't have all the information.

No matter how much you analyze it, it's missing something else to do the full comprehension. Without getting too technical or into too much detail, the data itself is one part of the puzzle. The other part is common knowledge that has to be engineered in the system to fill in the gaps. Because machines don't know what we all know. That's why it's difficult for machines. So we need approaches that can discover the missing text, the missing information, that's not in the data. Which makes it very challenging.

CC: But fun, right?

WS: Yes. I mean, you're in a restaurant or in a bar, and you hear the waitress say, “Hey John, the corner table wants another beer.” Tables don't have wants, and tables definitely don't drink beer. The missing stuff here, based on common sense, is that the people sitting at the corner table want another beer. The data itself is not enough. I think that is enough for now about that, because it's a long subject.
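As a rough illustration of the "missing text" idea from the examples above, the sketch below fills in what a sentence leaves out by consulting a tiny hand-written common-sense table. Everything in it (the table entries and the function names `fill_missing_text` and `resolve_subject`) is a hypothetical toy written for this illustration; it is not a description of any real system, and real common-sense knowledge is far too large to list by hand.

```python
# Toy common-sense tables. The entries are hand-written assumptions
# for illustration only; they are not taken from any real resource.
DEFAULT_ACTIVITY = {
    "movie": "watching",   # "enjoyed the movie" -> enjoyed watching it
    "book": "reading",     # "enjoyed the book"  -> enjoyed reading it
}
INANIMATE_PLACES = {"table"}  # things that cannot themselves want a beer


def fill_missing_text(sentence: str) -> str:
    """Insert the implied activity after 'enjoyed' when the object is known."""
    words = sentence.rstrip(".").split()
    if "enjoyed" not in words:
        return sentence
    i = words.index("enjoyed")
    activity = DEFAULT_ACTIVITY.get(words[-1].lower())
    if activity is None:
        return sentence  # no common-sense entry: leave the sentence unchanged
    return " ".join(words[: i + 1] + [activity] + words[i + 1:]) + "."


def resolve_subject(subject: str) -> str:
    """Reinterpret an inanimate subject as the people located at it."""
    head = subject.split()[-1].lower()
    if head in INANIMATE_PLACES:
        return f"the people at {subject}"
    return subject


print(fill_missing_text("Mary enjoyed the movie."))
print(resolve_subject("the corner table"))
```

The text alone never says "watching" or "people"; both come from the lookup tables, which is the point of the examples: the information has to be supplied from outside the data.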

CC: I don't know if this is too big of a question, but how could you potentially see that translating into the life sciences industry?

WS: So for now, we're sort of picking the low-hanging fruit. We're doing summarization, although not to minimize what we're doing. Because even that part hasn't been done, right? We would like to be, and we will be, the best in this. So summarization, semantic search, retrieval, clustering, categorization, all that stuff that will save professionals lots of time. Once that is done, you can go into full language understanding where you would need this advanced technology. How can you use that?

Imagine reading an article and fully understanding it, right? If we're talking language understanding, you can create a form of structure from the content you understood that you can now query. So you can basically create systems that will understand the article like a human would, and build a knowledge graph from it. Then you can query the knowledge graph. Now it's like you're querying the article that you fully understood. There's no limit to how far this can go.
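To make the "build a knowledge graph, then query it" idea concrete, here is a minimal sketch that stores facts as (subject, relation, object) triples and looks them up. The `KnowledgeGraph` class and the facts (`drug_x`, `condition_y`, and so on) are hypothetical placeholders, and the hard part described above, extracting such facts by actually understanding an article, is assumed to have already happened.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A minimal triple store: (subject, relation, object)."""

    def __init__(self):
        # Maps (subject, relation) to the list of objects seen for that pair.
        self._edges = defaultdict(list)

    def add(self, subject, relation, obj):
        """Record one fact."""
        self._edges[(subject, relation)].append(obj)

    def query(self, subject, relation):
        """Return every object linked to subject by relation, in insertion order."""
        return list(self._edges.get((subject, relation), []))


# Hypothetical facts a system might extract from a single article.
kg = KnowledgeGraph()
kg.add("drug_x", "treats", "condition_y")
kg.add("drug_x", "side_effect", "nausea")
kg.add("drug_x", "side_effect", "headache")

print(kg.query("drug_x", "treats"))
print(kg.query("drug_x", "side_effect"))
```

Once the facts are in this form, a question such as "what are the side effects of drug_x?" becomes a lookup rather than a re-reading of the article.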

CC: It's interesting, too, because it shows that as technology advances, it doesn't minimize the role of the subject matter expert. It just allows them to do more.

WS: It allows them to use their time efficiently, not spending time on tedious stuff that can be automated. Actually, you're empowering them.

CC: That's the goal. Okay, so this is a question I've been eager to ask. I've heard you like to make some dishes as a hobby. Can you tell me more about that?

WS: Oh, yeah, I mentioned this. Yeah, it's a hobby. For me, it's very therapeutic. I like to invent stuff. For me, it's like an art. It's like a language. You have the vocabulary, the ingredients, and you can combine them in different ways. So I like to do that. Yeah. I put on nice music and invent some new thing. But I have a problem, because I do it on the spot, whatever I feel like doing at the moment. So later on, if somebody had some of it and liked it and said, “Could you do that again?” I've forgotten all the details.

CC: Each is unique.

WS: Yes, yes, each one is different.

CC: That's great. Is there anything else you'd like to share with the audience?

WS: Ah, well, first, the Sorcero audience. Thank you for this. I mean, I really had a good welcome, and I'm really glad I'm here. It’s the third week, and it's amazing that we have such great people. I'm loving it more every day. They're not only very talented, brilliant people, but very friendly. And everybody is willing to help. You don't even have to complete the question. So I'm really glad I'm here. And from what I know so far, this is going to be an amazing journey. And we're going to go places. So I'm glad I'm here. That's for the Sorcero crowd. For others - look me up on Medium, and please challenge me. Correct me and give me feedback. I like debating AI and language.

CC: That's great. Well, we are so lucky to have you. Again, thank you so much for taking the time to join us today. We really appreciate it, and yes, everybody look up Walid on Medium. Thank you again.

WS: Thank you. Thank you.


Christina Cullen

Christina is a skilled content marketer and storyteller with a knack for SEO.
