My road to data analysis took a long time and was not straightforward. As a child I was home-schooled and received an advanced education in mathematics and physics. But when, in the 1990s, the Soviet Union and all the republics within it committed economic suicide, my chances of getting an advanced degree dwindled to zero. It wasn’t until the early 2000s, after moving to Pullman, WA, that I had an opportunity to advance my education.
When I started my college career, I was torn about what to do. On the one hand, I wanted to do something applied. On the other hand, I enjoyed logic, and mathematics appealed to me because of its abstract side and its ability to generalize. So, a degree in computer science seemed like a happy medium between the two. Very early in my undergraduate degree I got involved in scientific research, but it wasn’t until I took a course in Machine Learning that I knew where I would like to be. However, it took another couple of years for me to figure out what exactly I wanted to do.
As I looked at different (applied) problems in data analysis, I became aware of the difficulties associated with analyzing high-dimensional data (overfitting, computational strain, etc.), and also that dimensionality reduction methods are not always appropriate. At the same time, there are strong indications that distributions in high dimensions do not behave the way they do in lower dimensions. A prominent example of this is the concentration of measure phenomenon. So, I looked for alternative ways to describe data. That led me to the field of topological data analysis, which is what I’m currently working on.
High-dimensional data is here to stay for a long time, but it is hard, if not impossible, to visualize. That robs us of the ability to develop intuition about it, but mathematics can provide us with tools that describe data precisely. My long-term hope is to develop a set of mathematical methods that can deal effectively with the curses of high dimensionality while exploiting its blessings.
Svetlana works on data analysis and machine learning. She is a PhD student in EECS (her EECS advisor is Shira Broschat) and is working with Bala Krishnamoorthy on problems that apply computational topology to data.