Data source: https://tianchi.aliyun.com/competition/entrance/531830/information
1. Import modules and data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
import warnings
warnings.filterwarnings('ignore')
data_train = pd.read_csv('F:/python/阿里云金融风控-贷款违约预测/train.csv')
data_test_a = pd.read_csv('F:/python/阿里云金融风控-贷款违约预测/testA.csv')
2. Basic understanding of the data
data_train.shape,data_test_a.shape
((800000, 47), (200000, 48))
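The two shapes differ in column count (47 vs. 48), so it can help to see which columns appear in only one of the two tables. The following is a minimal sketch on toy data, not the actual competition files; the helper name `column_diff` and the toy column names are hypothetical and only serve as illustration.

```python
import pandas as pd

def column_diff(train, test):
    """Return (columns only in train, columns only in test), each sorted."""
    train_cols, test_cols = set(train.columns), set(test.columns)
    return sorted(train_cols - test_cols), sorted(test_cols - train_cols)

# Toy stand-ins for data_train and data_test_a (column names are illustrative)
toy_train = pd.DataFrame({"id": [1, 2], "loanAmnt": [1000.0, 2000.0], "isDefault": [0, 1]})
toy_test = pd.DataFrame({"id": [3], "loanAmnt": [1500.0], "extra_a": [0], "extra_b": [1]})

only_train, only_test = column_diff(toy_train, toy_test)
print("only in train:", only_train)
print("only in test:", only_test)
```

On the real data the same call would be `column_diff(data_train, data_test_a)`.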
Check the distribution of the target y
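One way to inspect the target distribution is to count each class and compute its share of the rows. The sketch below runs on a toy DataFrame rather than the real training set; the label column name `isDefault` is an assumption (the text above does not name it), and `target_distribution` is a hypothetical helper.

```python
import pandas as pd

def target_distribution(df, target_col="isDefault"):
    """Return the count and proportion of each class in the target column."""
    counts = df[target_col].value_counts()
    ratios = df[target_col].value_counts(normalize=True)
    # Both Series share the class labels as index, so they align row by row
    return pd.DataFrame({"count": counts, "ratio": ratios})

# Toy stand-in for data_train; the column name is assumed, not taken from the source
toy = pd.DataFrame({"isDefault": [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]})
dist = target_distribution(toy)
print(dist)
```

With the real data this would be `target_distribution(data_train)`, provided the label column really has that name. A strongly skewed ratio would suggest treating the task as an imbalanced classification problem.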
Original article: https://www.cnblogs.com/cgmcoding/p/13667882.html