Our clients typically need to complete their machine learning projects quickly. They also need to stay flexible and nimble without adding headcount. Our outsourced data preparation service takes care of your ETL tasks (Extract, Transform, Load) and delivers quick and accurate data for your needs. We help with the 80% of the work that slows down the analytics portion.
This 80% typically involves data preparation tasks such as data digitization, data aggregation/consolidation, data entry, and data labeling.
Our trained and experienced teams offload these tasks and help your team accelerate your data analytics projects. We understand your needs. We can work with Microsoft Access, PDF files, Excel, and statistical files, and connect to data on Amazon Athena, Aurora, EMR Hadoop Hive, Redshift, Microsoft Azure, Apache Drill, Aster Database, Cloudera Hadoop, Denodo, EXASOL, Google Cloud, Hortonworks Hadoop Hive, IBM DB2, IBM PDA (Netezza) and more!
Data digitization involves scanning paper-based forms and applying OCR, with human editors ensuring that the digitized data is accurate. Most often, data sits in different silos. Our teams can consolidate it into one dataset for your teams to work on.
Typical tasks at this stage include:
Cleansing (correcting inconsistent data, renaming and formatting fields, handling outliers, and grouping common values)
Integration (Union and Joins of various input sources)
Transformation (aggregation, creating new calculated fields, pivoting data, creating links among data sources, and reshaping data)
Annotation (data, text, image, video, and audio labeling to provide your machine learning project with labeled data)
Reduction (removal of unwanted data)
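As a rough illustration of how the cleansing, reduction, integration, and transformation steps listed above fit together, here is a minimal sketch in plain Python. The records, field names, and the `cleanse` and `prepare` helpers are all hypothetical and exist only for this example; a real project would work against your own sources and tools.

```python
from collections import defaultdict

# Hypothetical records from two separate silos (illustrative data only).
orders = [
    {"customer_id": "C1", "region": " north ", "amount": "120.50"},
    {"customer_id": "C2", "region": "NORTH", "amount": "80"},
    {"customer_id": "C1", "region": "North", "amount": "n/a"},  # unusable amount
    {"customer_id": "C3", "region": "south", "amount": "40.00"},
]
customers = {"C1": "Acme", "C2": "Bolt", "C3": "Cato"}


def cleanse(row):
    """Cleansing: normalize text fields and parse numbers; bad amounts become None."""
    try:
        amount = float(row["amount"])
    except ValueError:
        amount = None
    return {
        "customer_id": row["customer_id"],
        "region": row["region"].strip().title(),
        "amount": amount,
    }


def prepare(order_rows, customer_names):
    cleaned = [cleanse(r) for r in order_rows]
    # Reduction: drop rows whose amount could not be parsed.
    kept = [r for r in cleaned if r["amount"] is not None]
    # Integration: join each order to its customer name from the second source.
    joined = [
        dict(r, customer=customer_names.get(r["customer_id"], "Unknown"))
        for r in kept
    ]
    # Transformation: aggregate the order amount per region.
    totals = defaultdict(float)
    for r in joined:
        totals[r["region"]] += r["amount"]
    return joined, dict(totals)


joined, totals = prepare(orders, customers)
print(totals)
```

The inconsistent spellings of the region collapse into one value during cleansing, so the aggregation groups them together.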
Data entry involves human intervention to complete the missing values that would otherwise invalidate the records. Data editing is also needed to normalize the inputs.
Finally, data labeling (Annotation) helps you train your AI process. Our teams have labeled hundreds of thousands of images for past projects.
Our project managers also have data analytics backgrounds and understand your team’s need for data quality. Contact us for a Zoom conference to find out more.
Robolawyer startup DoNotPay launched Robo Revenge. It enables users to sue any US-based company that spams them with robocalls. The service generates a fake credit card number that you give to the telemarketer. Once they use it, their details are recorded in the system.
The service then uses the robocaller’s details to help you sue them for as much as $3,000. The service is live now on DoNotPay’s website and in its app.
In a 2013 Dimensional Research survey sponsored by Zendesk, respondents were asked what makes for a bad interaction: 72% said it was having to explain a problem to multiple people, and 51% cited the problem not being resolved.
Not only are customers most frustrated with the way customer service issues are handled, but 58 percent said they were more likely to share customer service experiences today than they were five years ago, with more and more people sharing experiences on social networking sites and writing online reviews.
KLM is a Dutch airline that evolved its use of Twitter in unexpected ways. It originally used Twitter as a platform for social media marketing, then quickly realized that it could also be used as a platform for communicating service disruptions. This proved invaluable during the 2011 Iceland volcano eruption.
Are you looking to increase the effectiveness of your outbound/telemarketing campaigns? Our system leverages an advanced machine learning algorithm that looks at your data. The system then generates ‘rules’ that help us correlate your leads with the most effective ‘benefit statements’ (i.e., cost benefit, time benefit, quality benefit).
This means that our telemarketing team can connect more effectively with your leads by using the benefit statements that connect or matter most to your leads.
Learn more. Chat with us now by clicking on the tab at the bottom right.
[Synthesized Speech in the likeness of Pres Donald Trump] [Laughter]
Dr. Kai-Fu Lee: So this slide shows you the power of deep learning, because it is built on deep learning technologies and it could re-synthesize speech like President Trump. That was not him speaking; it was the synthesis system.
And deep learning is a big technology breakthrough that can do the following in a single domain. It works in one domain only, with a huge amount of training, with superhuman performance.
So when trained on Amazon clicks, it learns how to make the most money from people entering the site. When trained on large amounts of President Trump’s speech, it talks like him. When trained on CT of lung cancer and non-cancer, it learns to distinguish them from each other. And when trained on Go games, it learns to play the game Go with human capability. So, that has been the most amazing breakthrough in the past 63 years of AI history.
And in terms of applications, it actually covers many fields. Many people get confused and think only robots and autonomous vehicles are AI. But actually the same deep learning algorithm goes through these four waves of artificial intelligence. Let me start from the first wave. It’s the Internet wave.
Naturally, (unintelligible) in English we collect the most data and it’s automatically labeled. We are guinea pigs labeling (candidate) for Facebook, Amazon, and Google: when we use them and click, it learns what you like and don’t like.
And then – and those what I’m going to show you and that’s how the websites are no better than usual but things you want to click on. [Background noise] And today has so much revenue.
What’s more, each implementation or deployment for internet websites comes with an objective function maximizer. So, Amazon can choose to maximize revenue or profits. And Facebook can choose to maximize virality or minutes per user.
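To make the idea of an objective function maximizer concrete, here is a small Python sketch, not anything described in the talk itself. The candidate items, their predicted click probabilities, prices, and margins are invented numbers, and the function names are hypothetical. The point is only that the same ranking machinery recommends different items depending on which objective it is told to maximize.

```python
# Hypothetical candidates with made-up model predictions (illustrative only).
candidates = [
    {"name": "A", "p_click": 0.10, "price": 50.0, "margin": 0.10},
    {"name": "B", "p_click": 0.30, "price": 10.0, "margin": 0.50},
    {"name": "C", "p_click": 0.05, "price": 80.0, "margin": 0.40},
]


def expected_revenue(item):
    # Chance of a purchase times the sale price.
    return item["p_click"] * item["price"]


def expected_profit(item):
    # Expected revenue scaled by the profit margin on that item.
    return item["p_click"] * item["price"] * item["margin"]


def engagement(item):
    # Pure click likelihood, ignoring money entirely.
    return item["p_click"]


def best(items, objective):
    """Return the name of the item that maximizes the chosen objective."""
    return max(items, key=objective)["name"]


for objective in (expected_revenue, expected_profit, engagement):
    print(objective.__name__, "->", best(candidates, objective))
```

With these invented numbers, each of the three objectives picks a different item, which is the choice the speaker describes companies making.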
So, imagine the job of the CEO suddenly got a lot easier. And that’s also why these internet companies are near (shooting down) their companies, just because they have a lot of training data and (financial) (that helps them) make more money.
Now, second wave. It’s applicable equally to businesses that have lots of data – to banks, insurance companies, hospitals, government branches, so on and so forth. Anyone who has a data repository can now apply AI to it.
Take as an example (unintelligible) bank. Whenever I meet with a bank CEO, there’s always a fourth person sitting in the back of the room looking at the record. That is the head of the data center, because that person generates no profits, no revenue. It’s merely a cost center storing data for archival purposes.
But guess what, that data center and that person have just turned into a mountain of gold. Because with all the customer transactions and history, a bank can now optimize customer asset allocation, can help make more money, can help minimize default, can help detect credit card fraud. So that is phenomenal, and similarly for insurance companies and so on.
But before you start thinking it is only to be used by traditional companies, let me show you an example where it can be used to disrupt companies.
We found that AI loan application and it’s (an app every time) [Audio Break] and when you download it, you can fill in all the blanks such as your income, your name, your workplace, and so on, about 10 fields. But also, you have to upload your phone data to the loan application decision maker.
And that data isn’t everything on your phone, obviously. It’s only things permitted by iOS and Android. It’s the same things that Facebook and Twitter take; nothing more and nothing less.
And what the system will do is decide whether to give you a loan based on what you entered and what you uploaded. So think about it. If you were to walk outside the National University and run into 3,000 strangers, could you pick 1,000 of them and loan each SG$500? There goes (SG$500,000). What percentage of default rate might you get? Very high, right? Maybe 80%, 90%. Okay, Singapore maybe 60%. [Laughter] But still a high number.
So, guess what: the default rate for this loan company is 3%. So, how does it manage to do that? Because it has so much information from the phone that you regard as useless. It’s actually little bits of valuable information.
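The arithmetic behind the loan example above can be sketched in a few lines of Python. This is a simplification under stated assumptions: every loan is the same size, a defaulted loan loses its full principal, and interest and partial recovery are ignored. The function name is hypothetical.

```python
def principal_lost(num_loans, loan_amount, default_pct):
    """Principal lost if default_pct percent of equal-sized loans are never repaid.

    Assumes full loss on each default and ignores interest (a simplification).
    """
    return num_loans * loan_amount * default_pct // 100


num_loans = 1000
loan_amount = 500  # SG$ per loan, as in the example
total_lent = num_loans * loan_amount

# Compare the default rates mentioned in the talk: 80%, 60%, and 3%.
for pct in (80, 60, 3):
    lost = principal_lost(num_loans, loan_amount, pct)
    print(f"{pct}% default: SG${lost:,} lost of SG${total_lent:,}")
```

Under these assumptions, moving from a 60% default rate to 3% cuts the principal lost from SG$300,000 to SG$15,000 on the same SG$500,000 lent.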
Well, together correlated with each other to make a very smart decision. This obviously includes all (unintelligible) typed but also how fast you typed it, that makes the difference because if you’re faster you’re probably copying and type slower.
It also has information like what day of the month it is. Is it before payday or after payday? Before payday, it’s a good loan. After payday, not so good, because you just got paid, so why do you need money?
We would also have the information on (what else do you have). If you have games, you have gambling apps, you have illegal apps, or perhaps you have all serious knowledge apps, that makes a difference on your loan. The model of your phone, that makes a big difference – very big difference.
And also, once you come (back to us), I’m going to keep (right in it). Are they real people? Is the person you called “dad” really your dad? So on and so forth. These are all things that can be found merely using the data you upload and it’s available on the internet.
So, just out of curiosity we asked how many such features are there. There were 3000. And just out of curiosity we asked, what’s the least important feature? It turns out to be your battery level.
So, why does that matter at all? Well, if you’re (unintelligible) who plugs in your phone all the time, you’re probably a little bit correlated with someone who returns loans. If you’re someone who lets your phone run out of battery, you’re probably a little bit correlated with someone who defaults.
So obviously, that is not an important factor. That might be 1 billionth of all the information that’s still there, but all of it is considered. And in aggregate it’s something (worth looking into).
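One way to picture how thousands of individually negligible signals add up is a weighted sum passed through a logistic function. The talk does not say what model the loan company uses, so the logistic form, the equal weights, and the bias below are all assumptions chosen purely for illustration.

```python
import math


def default_probability(features, weights, bias):
    """Logistic score: each feature nudges the log-odds by its (tiny) weight."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


n = 3000                 # number of features mentioned in the talk
weights = [0.001] * n    # assumption: every signal is equally weak
bias = -1.5              # assumption: arbitrary baseline log-odds

no_signals = default_probability([0] * n, weights, bias)
one_signal = default_probability([1] + [0] * (n - 1), weights, bias)
all_signals = default_probability([1] * n, weights, bias)

print(f"no risk signals:  {no_signals:.4f}")
print(f"one risk signal:  {one_signal:.4f}")
print(f"all risk signals: {all_signals:.4f}")
```

A single weak feature, like the battery level, barely moves the estimate, but when many weak features point the same way the combined shift is large.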
Moving on to the third wave, that’s when machines start to see and hear and sense with things like cameras, microphones, e-sensors, movement sensors, [background noise] maybe reconstruction devices (unintelligible). All of that gathers data that before was transmitted, non-existent and now can be used to make smart decisions.
I’ll give you an example. You all know, (that video that he suddenly) (unintelligible) the police mentioned. [Laughter] Do you know why? Because in the last concert he gave, over 20 people were arrested from the most wanted list. Because the stadiums were – they were – they had cameras installed. And the cameras were connected to face recognition and to the criminal database.
And as a result, the police apprehended some number of people and said to some of them, “Oops, sorry. Your ID looks good. My mistake. Please enjoy the concert.” But the unfortunate 20-something people who were actual criminals, who (thought) they could sneak out for one night from wherever they were hiding to enjoy a concert, they were caught.
This talk is not about whether that’s a great act or whether that worries people, but it’s about the power of face recognition. No human could possibly do this. Even if you took the 1,000 best policemen in China, they couldn’t possibly identify 25 criminals out of a field of 60,000 people in the concert, because we simply don’t have the memory or the face recognition capability. And we get tired; fatigue overcomes us. We can’t possibly do that.
So, hopefully the loan example and the example at that concert will show you that AI is far surpassing our capabilities, if you satisfy the requirements: one domain, lots of data.
Now, in the third wave we will also have autonomous stores that recognize your face, your movement, your gesture, your intent: that you picked up this bottle of water and maybe you bought it, put it in the basket, and it charges you, or maybe you looked at it with disgust and put it back. That indicates something. Turning an offline store into one with the same power as Amazon or online. What do you think from it? That just put there then security comes up. That’s it. Don’t drink from my water.
So, all these things are the kind of (unintelligible) in wave three and it will automate a lot of things with smart vision and hearing.
Wave four is robotics and autonomous vehicles. Most people know a lot about those already. But I can tell you, it is actually difficult for robots to do everything humans can do. I’m on the board of Foxconn. And I can tell you, it’s not going to be easy displacing people who make iPhones, those jobs that require that level of dexterity and hand-eye coordination.
However, there are many jobs that are stationary and repetitive that can be displaced by robots. For example, watering plants selectively with just the right amount of water and fertilizer based on the growth as observed by computer vision. Such as picking fruits, such as doing dishes, doing inspection on assembly. So we’re still talking about tens of millions of jobs, but just not very high dexterity jobs.
And the autonomous vehicle is of course the biggest of all breakthroughs that will change the way we transport ourselves. Logistics, delivery, it will make life so much more efficient, convenient. It will make the air safer; there will be fewer fuel cars on the road. You will no longer have to buy cars, and it will be a lot safer – especially over time. Because one very important thing is more data makes AI possible, and more, more, and more data makes AI better.
So the moment autonomous vehicles are launched, hopefully they’re pretty safe. And then five years later, they become extremely safe. Five more years, even safer. So it keeps getting better and better. So that’s the fourth wave.
Each of these waves represents something like a 5% increment to, say, the GDP. Each also represents some 5% of jobs displaced. This slide shows you the key things that make AI work: massive data, very good labeling, and a single domain. And usually, you need a lot of compute power and some AI experts.
So who has the AI experts? Obviously, America – more specifically, America with some British and Canadians are by far the leaders in the world. So, why do I say that China has a chance? Because there are three very important observations to make.
The first is there are very few breakthroughs. So, people generally assume there are a lot of breakthroughs because you read about them in the paper. But actually, all the breakthroughs that you read about in the paper are built on deep learning or similar technologies.
And deep learning is fairly well understood, nor can we expect (the US) to continuously come up with another big breakthrough, because after all, in the last 62 years, there was only one breakthrough. Why do they think there will be five more in the coming years?
Secondly, because the technologies are well understood, we’ve now moved beyond technology discovery and disruption. We’re moving to taking the mature technologies and applying them. Just like the early days of the internet. The discovery of TCP/IP, amazingly important. The building of the web browser, amazingly important. The invention of electricity, amazingly important. But there never was a TCP/IP 2.0 that disrupted everything. There was a 2.0 but it was a small net. There never was an electricity 2.0 that destroyed everything. It was the 1.0 where the groundwork was done, and then all the applications were built on top of TCP/IP, the browser, and electricity. So I like to say that deep learning is like these things. So we’re in the era of implementation.
Thirdly, AI is a very open domain. All that is known is largely in open source. If you want to learn AI and take courses, all of the open source is out there and you can build what you want to build. So, companies compete on the implementation and the ideas, and how quickly they can make money, not on the breakthroughs. Because the breakthroughs are done, we understand them. They’re in the open space.
Now, how does China build on these three things? Well first, China has a lot of great AI researchers and engineers. They are not as similar as the American ones; in this chart you see that of all the AI papers, 42% have Chinese authors, and 40% – 2% of all the authors are Chinese.
And the Chinese can innovate in these products. You see, the Chinese used to be copycats, but the Chinese became equal to the US in the in-between stage and now have leapfrogged to build $300 billion of value in the orange slice that you see.
And the Chinese entrepreneurs, they are hungry. They’re good at finding business opportunities. They work hard. They build barriers because they’re surrounded by copycats. They have to build products that are uncopiable. The only uncopiable product is one that has a moat built around it. That’s also hard to copy.
So, for example, Meituan’s 600,000 delivery people and the infrastructure and vehicles cost billions of dollars. For example, (Didi) going into buying these vehicles, leasing them, insurance, gas stations, that locks up the domain. So that’s the Chinese method of competition. It’s a very good fit with AI. They’re built up over time. They’re capital intensive. They use a lot of money and then they build a moat that’s very hard to imitate.
The fourth reason is China has a lot of money. A lot of money flows to Chinese AI, and this is not government money. This is private money. And a lot of this money goes into funding Chinese AI companies. Chinese AI funding exceeded the US in the last year. [Cough]
And as an example, these are five of our unicorns. So these are AI companies we invested in that have become worth over $1 billion in market capitalization, and the total value is $21 billion. And the newest of these companies was founded only two years ago. These are concepts that I think are unheard of or perhaps not even believed in Singapore. But now you’ve heard it, so you should believe it.
The fifth reason is the power of massive data. The right chart shows you, the more data, the better it performs. In fact, in AI we have a saying called “There’s no data like more data”. Anybody care to guess who said that? A gentleman named, Dr. Bob Mercer, the founder of Cambridge Analytica. [Laughter] A very famous esteemed AI researcher who turned into additional loss. [Laughter]
And so in the age of AI, if data is the new oil [Cough], then China is the new OPEC. China has not only more people but also more usage. Chinese people use takeout more because in China you can get food delivered to you from 500 restaurants in 30 minutes, costing US$0.70 per delivery, and that is the amazing thing that causes Chinese people to have more depth in usage, and that’s where the data comes from.
A lot of people in the West assume that Chinese people just don’t care about data, companies’ trade data. Government is always aware. [Cough]. That’s not true. The companies behave much less [background noise] (than Western do things). But it’s just that there are more people and they use the data more.
In particular, I want to point out the use of mobile data is particularly important because mobile data is the most valuable data. It is – you are paying for something. It’s not just “put it on the page” but you are paying something, and in it you want something and that can be used as a rocket fuel to learn a lot of great AI. [Cough]
And finally the Chinese (government) [cough] strongly supports AI. [Cough] And Chinese policies tend to be techno-utilitarian, which means try it out and then regulate it only if issues occur. So with mobile payment, that may have (been stopped) in the US because credit card companies may raise the issue that software companies can be hacked, or can commit fraud, or can’t be trusted with managing your money. But China will trust Alibaba and Tencent as long as they live up to their worth and they are proven trustworthy. So they’ve taken over the credit card space.
And also China has an AI plan, on the left side, wanting to be the global best by 2030. And then with that plan, each enterprise in each city may come up with a specific plan.
So, for example, the state-owned banks. Once the government said AI is important, they might procure some AI software. And the city of Nanjing said our schools are very good, let’s build the world’s largest AI science park. And China is happy to build a new city called “Xiongan” which has autonomous vehicles built in, with a top layer for pedestrians and a bottom layer for cars, thereby avoiding the kind of accidents we saw in Phoenix with Uber autonomous.
So as a result, we anticipate that China will catch up with the US somewhere between now and the next five years. And the most important message to take away is that China and the US will be by far the co-leaders in AI.
Who exactly will be ahead really depends on a lot of things. This shows the projection: China’s slightly ahead, but only in implementation. The US is currently ahead in research, so new technologies could be invented that put the US back in the lead. But what is clear is that in this race, there are not three medals like the Olympics. There are only two medals. And they belong to the US and China. Who gets the gold remains to be seen, but there is no bronze medal.
AI will create a huge amount of value, about $16 trillion net additional GDP, but it will also bring a lot of challenges. And in the interest of time, I’m just going to cover one issue, which is job displacement. That is, with AI being able to do so many jobs, are all our jobs going to be taken away? Well, they are not.
If you think about what AI cannot do, there are two sorts of things. One is creative things and the other is things that require empathy, compassion, people-to-people connection. So these three attributes separate all the jobs in past that we do will find and in fact on the lower left, all the jobs will be taken by AI. And that’s of concern and we need to do something about that.
But the jobs on the lower right are a perfect example of human-AI symbiosis, with AI tools helping scientists find more cures for cancer. On the upper left, we will find that AI can be the analytical core, while the human provides the warmth. For example, in the case of a physician, AI can do the diagnosis, then the physician connects with the patient, gets the patient to tell all the problems, enters them into the AI engine, and provides the comfort and confidence, thereby maximizing the likelihood of recuperation but also making the cost of healthcare much lower.
And then on the upper right side is where humans will excel in both compassionate as well as creative skill sets. So we do have something to worry about in the lower left, but we also have a lot to celebrate in the other three components.
But the most important thing, I think, is to look further out into the future. I think your children, for those of you who are students, and your grandchildren, for those of you who are teachers, will probably enjoy an amazing life. Because by the time they feel the effect of AI, they will only see that AI has liberated us from doing routine jobs, allowing us to have a lot more free time to love the people we love, to do the things we’re passionate about, and to have time to think about what it means to be human.
And for those of you who are a little fearful of AI, remember it is just a tool. We’re the only ones who have free will. We will control the AI tools and we get to write the ending to the AI story. Thank you.