qunar_data_mining

Data preprocessing, machine learning, regression, classification, and text analysis

preprocessing

  • qunar_data_merged.csv: raw data

  • regression.csv: final regression data

  • classfication_all_14features.csv: final classification data

  • preprocessing.ipynb: step 1, data cleaning

  • cluster_variable_select.ipynb: step 2, dimensionality reduction

  • R_reg_preprocess.R: R script that generates the regression data

  • R_cla_preprocess.R: R script that generates the classification data

data_operate

  • class_train_all.csv: training data used for classification

  • class_test_all.csv: test data used for classification

  • R_operate.Rmd: classification with machine-learning models in R
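
This README does not say how the train and test files were derived from the final classification data. As a rough illustration only, the sketch below assumes a simple seeded random split; the column names, the 25% test fraction, and the helper names `split_rows` and `write_csv` are all hypothetical and are not taken from the repository.

```python
import csv
import io
import random


def split_rows(rows, test_fraction=0.3, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(rows, fieldnames, handle):
    """Write a list of dict rows to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Tiny in-memory stand-in for the classification data.
# The columns here are hypothetical, not the repository's 14 features.
sample = io.StringIO(
    "feature_a,feature_b,label\n"
    "1,0.5,yes\n"
    "2,0.1,no\n"
    "3,0.9,yes\n"
    "4,0.4,no\n"
)
reader = csv.DictReader(sample)
rows = list(reader)

# Assumed split: 25% of the rows go to the test set.
train, test = split_rows(rows, test_fraction=0.25, seed=42)

# Write both parts to in-memory buffers; real code would open files instead.
train_out, test_out = io.StringIO(), io.StringIO()
write_csv(train, reader.fieldnames, train_out)
write_csv(test, reader.fieldnames, test_out)
```

To produce files on disk, the `io.StringIO` buffers could be replaced with handles from `open(path, "w", newline="")`. A fixed seed keeps the split reproducible between runs.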