This project collects machine learning knowledge and, in particular, shares machine learning code for bioinformatics. The workflow covers feature extraction, feature selection, and model training (for example SVM, random forest, naive Bayes, and bagging). All code is written in Python.
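The feature selection and model training steps can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the code shipped in this repository: the synthetic data from `make_classification`, the choice of `SelectKBest` with `f_classif`, the helper name `evaluate_models`, and all parameter values are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for an extracted feature matrix (assumption: not project data).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=8, random_state=0
)

# The classifier families named in the project description.
models = {
    "SVM": SVC(kernel="rbf"),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "NaiveBayes": GaussianNB(),
    "Bagging": BaggingClassifier(n_estimators=20, random_state=0),
}


def evaluate_models(X, y, k_features=20, cv=5):
    """Return the mean cross-validated accuracy of each model (hypothetical helper)."""
    scores = {}
    for name, model in models.items():
        # Feature selection sits inside the pipeline, so it is refit on each
        # training fold and never sees the held-out fold.
        pipeline = make_pipeline(SelectKBest(f_classif, k=k_features), model)
        fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = evaluate_models(X, y)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Keeping the selector inside the pipeline is a deliberate choice: selecting features on the full dataset before cross-validation would leak label information into the evaluation.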
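- fearure_selection (programs related to feature selection)
- before_december_code (code for machine learning applications in biology, written before December 2017)
- december_code (code for machine learning applications in biology, written in December 2017)
- thaliana_code (work on identifying RNA sites in Arabidopsis thaliana)
- easy_excel.py (generates purpose-built Excel spreadsheets; requires the xlwt package)
- thaliana.fasta (Arabidopsis thaliana dataset)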
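As an illustration of the feature extraction step on a FASTA dataset such as thaliana.fasta, the sketch below parses FASTA records and turns each sequence into a k-mer frequency vector. The helper names `read_fasta` and `kmer_frequencies`, the RNA alphabet `ACGU`, the conversion of `T` to `U`, and the k-mer encoding itself are assumptions for this example; the repository's own scripts may use a different encoding.

```python
from itertools import product


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def kmer_frequencies(sequence, k=2, alphabet="ACGU"):
    """Return the relative frequency of every k-mer, in lexicographic alphabet order."""
    # Assumption: sequences written with DNA letters are treated as RNA.
    sequence = sequence.replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


# Made-up example records, not taken from the dataset.
example_lines = [">seq1", "ACGU", "ACG", ">seq2", "GGAT"]
records = list(read_fasta(example_lines))
features = [kmer_frequencies(sequence, k=2) for _, sequence in records]
```

To read a real file, pass an open file handle instead of the list, for example `with open("thaliana.fasta") as handle: records = list(read_fasta(handle))`. Each resulting vector has `4 ** k` entries and can be stacked into a feature matrix for the selection and training steps.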
- Pandas
- sklearn
- numpy
- matplotlib
- xlwt
- easy_excel location
- Reference paper for the Arabidopsis thaliana work: Chen, W., Feng, P., Ding, H. & Lin, H. Identifying N6-methyladenosine sites in the Arabidopsis thaliana transcriptome. Molecular Genetics and Genomics 291, 2225-2229 (2016).
- Download location for some of the materials