May 28 (Bloomberg) -- Nathan Han, who won the Gordon E. Moore Award at Intel’s International Science and Engineering Fair, discusses his project that uses big data for cancer research. He speaks with Cory Johnson on Bloomberg Television’s “Bloomberg West.” (Source: Bloomberg)

How do you define -- I read from people struggling to explain what you did.

I created a process that can be used to predict whether mutations you feed in are likely to cause cancer or not.

I did this by mining data from online public-domain databases and doing a bunch of statistical analysis and approaching it with a technique called machine learning.

It's like an algorithm you can use to teach computers how to classify data.

In this case, mutation.

Machine learning is pretty basic in coding.

What was unique about the code you wrote?

I used Wolfram Mathematica.

It is unsupervised, which means that it's flexible with the data you can input into it, and that allowed me to mine data from online public-domain databases.

I did not have a research mentor so I did not have access to proprietary databases.

In previous algorithms that were unsupervised, they only had an accuracy rate of about 40% but mine was 81% accuracy.

I've never heard of the unsupervised function.

It's a machine learning function where the data you train the function on, basically the data you feed into it, you don't have to know beforehand which category it belongs to.

In my case, the data I fed into my algorithm, I did not have to know whether it caused cancer or not to customize for learning.

That's the sort of data that's out there in public-domain databases.

In current algorithms, you need to know, for the data you input, whether it causes cancer or not from clinical outcomes.

Basically, it was allowed to be in an unstructured format and therefore you could look at it a different way.

This would not be limited to understanding genetic material. You could use it in all sorts of things, like what?

My algorithm is custom-designed for this biological application.

Theoretically, you could use it for just about anything.

Machine learning is really big right now.

Maybe spam e-mail filters.

There's tons of applications using machine learning.

Why did you pick this?

I was taking a biology course and it really introduced me to these online public-domain databases.

There is so much data out there, an incredible amount of cutting-edge data, and I was so fascinated.

How can I use this data to do something cool?

After that, one year ago, my family visited a close friend who had been diagnosed with ovarian cancer out of the blue and it inspired me to create a system where you could input mutations and tell you whether they were likely to cause cancer or not.

You are 15 years old.

What's next?

[laughter] I'm not entirely sure.

Now I need a summer job.

Ideally I would like to find a research lab where I could maybe continue my project, maybe get a patent, or find a practical application to apply this sort of system to save lives in the real world.

What will you do with your 75 grand?

Not sure yet about that either.

I will probably put most of it away for college tuition and maybe take some classes in the meanwhile.

