Machine Learning / Statistical Learning / Data Mining
World-class Expert: Alexander G. D'yakonov, D.Sc., Ph.D., Associate Professor, Moscow State University, Mathematical Methods of Forecasting Department
Currently ranked #2 among modelers/data miners on Kaggle.com, the competition center for data miners from across the world.
Profile: http://www.kaggle.com/users/3090/d-yakonov-alexander
Methods:
- Decision Trees, Random Forests
- Support Vector Machines (SVM)
- Artificial Neural Network Models (ANN)
- Bayesian statistics: naive Bayesian classifier, Bayesian hypothesis testing
- k-nearest neighbors (k-NN) models
- Ensemble models and "voting" approaches to modeling
- AdaBoost and other boosting techniques
- Bootstrapping
- Genetic algorithms
- Non-parametric regression
- Time series analysis by Fourier transform
- Monte Carlo methods
- Principal Component Analysis (PCA)
- Clustering: hierarchical, neural-network-based, or k-means
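To make two of the listed methods concrete (k-nearest neighbors and "voting"), here is a minimal sketch in plain Python. The function name `knn_predict`, the choice of Euclidean distance, and the toy data are illustrative assumptions, not a description of the models actually delivered to clients.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

In practice the value of k and the distance measure would be chosen by validation on held-out data rather than fixed in advance.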
Services:
- Algorithm development, testing, and evaluation for trading financial instruments (stocks, options, currency exchange pairs). Computational simulations on real data and on simulated data that closely resembles real data.
- Data mining for trends or performance metrics extracted from databases (surveys, census, claims, hospital records, consumer behavior, cell phone location information, etc.)
- Developing a model for predicting which of two or more classes a data point will fall into (i.e., classification).
- Developing a model for predicting a continuous variable (i.e., regression). Examples: How much money will a person spend the next time they enter a retail store? How many days will elapse before a specific person visits this store again?
- Developing a model to predict who a person will "friend" next on Facebook or "connect" with next on LinkedIn.
- Predicting which lecture videos a person will want to watch next based on other people's preferences and correlations. Applicable to movies, books, etc., as long as there is a large group of people rating the same products.
- Analyzing brain recording data (EEG) to differentiate between two or more mental states. In general, signal analysis by statistical means.
- Developing new diagnostic tests and extracting biomarkers with predictive ability for a disease/medical condition.
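As a minimal sketch of the regression service listed above (predicting a continuous variable such as the amount a customer will spend), the following plain-Python example fits a straight line by ordinary least squares. The function name `fit_line`, the single predictor (number of prior visits), and all numbers are invented for illustration; a real engagement would typically use richer features and one of the methods listed earlier.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data, invented for illustration: prior store visits vs. amount spent.
visits = [1, 2, 3, 4, 5]
spend = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(visits, spend)

# Predicted spend for a hypothetical customer with 6 prior visits.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

The closed-form fit is used here only because it keeps the example short; it assumes the predictor values are not all identical, since otherwise the denominator `sxx` is zero.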
Software:
- MATLAB
- R
- WEKA
- RapidMiner
- SAS