Is there a way to visualize a decision tree (sklearn) with categorical features merged back from one-hot encoded features?

Time: 2017-10-25 15:22:00

Tags: python scikit-learn graphviz decision-tree one-hot-encoding

Here is the link to the .csv file. It is a classic dataset that can be used to practice decision trees!

import pandas as pd
import numpy as np
import scipy as sc
import scipy.stats
from math import log
import operator

df = pd.read_csv('tennis.csv')

# Separate the target column from the feature columns.
# (target is a Series, so it has no .columns attribute to set.)
target = df['play']
features_dataframe = df.loc[:, df.columns != 'play']

This is where my headache begins:

features_dataframe = pd.get_dummies(features_dataframe) 
features_dataframe.columns

I am one-hot encoding the feature (data) columns stored in features_dataframe, which are all categorical. Printing the resulting columns returns:

Index(['windy', 'outlook_overcast', 'outlook_rainy', 'outlook_sunny',
   'temp_cool', 'temp_hot', 'temp_mild', 'humidity_high',
   'humidity_normal'],
  dtype='object')

I understand why one-hot encoding is needed! sklearn does not work with categorical columns.
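To make the naming scheme concrete, here is a minimal sketch of what `pd.get_dummies` does to a small frame. The three rows are made-up stand-ins for the tennis data, not the real file. Each string column is replaced by one indicator column per category, named `<column>_<category>`, while the boolean `windy` column is left alone and moved to the front.

```python
import pandas as pd

# Hypothetical three-row sample; the real tennis.csv has more rows and columns.
sample = pd.DataFrame({
    "outlook": ["sunny", "overcast", "rainy"],
    "windy": [False, True, False],
})

# String columns become one indicator column per category.
encoded = pd.get_dummies(sample)
print(list(encoded.columns))
```

The original feature name can therefore be recovered from the prefix before the underscore, which is what any attempt to merge the names back together has to rely on.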

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
le.fit(target.values)

k = le.transform(target.values)

The code above converts my target column, stored in target, to integers, because sklearn cannot work with categories (YAY!)
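As a rough illustration of the mapping, `LabelEncoder` assigns integer codes in sorted order of the class labels, which NumPy's `np.unique` can reproduce. The label values `"no"` and `"yes"` below are an assumption about what the `play` column contains.

```python
import numpy as np

# Hypothetical target values; the real column is read from tennis.csv.
play = np.array(["no", "no", "yes", "yes", "yes", "no"])

# classes holds the sorted unique labels, codes holds the integer code of each row.
classes, codes = np.unique(play, return_inverse=True)
print(classes)
print(codes)
```

Under that assumption, class 0 is "no" and class 1 is "yes", so a node label such as `value = [5, 9]` would read as 5 "no" samples and 9 "yes" samples.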

Now, finally, I fit the DecisionTreeClassifier. I assume criterion = "entropy" is what makes it use the ID3 concept!

from sklearn import tree
from os import system

dtree = tree.DecisionTreeClassifier(criterion = "entropy")
dtree = dtree.fit(features_dataframe, k)


# Write the fitted tree to a Graphviz DOT file; the file is closed automatically.
with open("id3.dot", 'w') as dotfile:
    tree.export_graphviz(dtree, out_file = dotfile, feature_names = list(features_dataframe.columns))

The file id3.dot contains the code that can be pasted at this link to convert the digraph code into a proper, readable visualization!

To make it easy for you to help me, I am posting the contents of id3.dot here!

digraph Tree {
node [shape=box] ;
0 [label="outlook_overcast <= 0.5\nentropy = 0.94\nsamples = 14\nvalue = [5, 9]"] ;
1 [label="humidity_high <= 0.5\nentropy = 1.0\nsamples = 10\nvalue = [5, 5]"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label="windy <= 0.5\nentropy = 0.722\nsamples = 5\nvalue = [1, 4]"] ;
1 -> 2 ;
3 [label="entropy = 0.0\nsamples = 3\nvalue = [0, 3]"] ;
2 -> 3 ;
4 [label="outlook_rainy <= 0.5\nentropy = 1.0\nsamples = 2\nvalue = [1, 1]"] ;
2 -> 4 ;
5 [label="entropy = 0.0\nsamples = 1\nvalue = [0, 1]"] ;
4 -> 5 ;
6 [label="entropy = 0.0\nsamples = 1\nvalue = [1, 0]"] ;
4 -> 6 ;
7 [label="outlook_sunny <= 0.5\nentropy = 0.722\nsamples = 5\nvalue = [4, 1]"] ;
1 -> 7 ;
8 [label="windy <= 0.5\nentropy = 1.0\nsamples = 2\nvalue = [1, 1]"] ;
7 -> 8 ;
9 [label="entropy = 0.0\nsamples = 1\nvalue = [0, 1]"] ;
8 -> 9 ;
10 [label="entropy = 0.0\nsamples = 1\nvalue = [1, 0]"] ;
8 -> 10 ;
11 [label="entropy = 0.0\nsamples = 3\nvalue = [3, 0]"] ;
7 -> 11 ;
12 [label="entropy = 0.0\nsamples = 4\nvalue = [0, 4]"] ;
0 -> 12 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
}
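The entropy figures in the node labels above can be checked by hand from the `value` counts. This is a small sketch using the standard Shannon entropy formula with base-2 logarithms. The function name `entropy` is just an illustrative helper, not an sklearn API.

```python
from math import log2

def entropy(class_counts):
    """Shannon entropy (in bits) of a node, given its per-class sample counts."""
    total = sum(class_counts)
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)

print(round(entropy([5, 9]), 3))  # node 0 (root): 0.94
print(round(entropy([5, 5]), 3))  # node 1: 1.0
print(round(entropy([1, 4]), 3))  # node 2: 0.722
```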

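Go to the site and paste the digraph code above to see the decision tree that was created! The problem is that for larger trees and larger datasets the result is hard to interpret, because the one-hot encoded columns appear as the feature names at each node split!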

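Is there a solution in which the decision tree visualization shows the merged feature names for node splits that come from one-hot encoded features?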

I mean, is there a way to create a decision tree visualization like the one here?
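One possible workaround, sketched here under assumptions, is to post-process the DOT text and rewrite each one-hot split label in terms of the original feature. The helper `merge_one_hot_labels` is hypothetical and not part of sklearn. It assumes the dummy columns follow the `<feature>_<category>` naming from `pd.get_dummies` and that every one-hot split has the threshold 0.5.

```python
import re

def merge_one_hot_labels(dot_source, categorical_features):
    """Rewrite 'feature_category <= 0.5' split labels as 'feature != category'.

    categorical_features lists the original column names that were one-hot
    encoded (hypothetical helper, not an sklearn function).
    """
    def rewrite(match):
        column = match.group(1)
        for feature in categorical_features:
            prefix = feature + "_"
            if column.startswith(prefix):
                category = column[len(prefix):]
                # 'dummy <= 0.5' is true when the dummy is 0, that is, when
                # the sample does NOT belong to this category.
                return f'label="{feature} != {category}'
        return match.group(0)  # leave other splits (e.g. windy) unchanged

    return re.sub(r'label="(\w+) <= 0\.5', rewrite, dot_source)


# Two node lines copied from the id3.dot output above.
dot_source = "\n".join([
    "digraph Tree {",
    r'0 [label="outlook_overcast <= 0.5\nentropy = 0.94\nsamples = 14\nvalue = [5, 9]"] ;',
    r'2 [label="windy <= 0.5\nentropy = 0.722\nsamples = 5\nvalue = [1, 4]"] ;',
    "}",
])
merged = merge_one_hot_labels(dot_source, ["outlook", "temp", "humidity"])
print(merged)
```

This only renames the labels, so the True/False branches keep their meaning. It does not collapse several binary splits on the same original feature into a single multi-way node, and it could mislabel a column if one feature name is a prefix of another.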

1 Answer:

Answer 0: (score: 0)

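It may be simpler not to use one-hot encoding, and instead to use some arbitrary integer codes to represent the categories of a particular feature.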

You can integer-encode the categorical variables.
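A minimal sketch of integer encoding follows, assuming `pandas.factorize` as the encoder (the answer does not confirm which tool it had in mind). The sample rows and the `category_maps` dictionary are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the tennis data.
features = pd.DataFrame({
    "outlook": ["sunny", "sunny", "overcast", "rainy"],
    "windy": [False, True, False, True],
})

encoded = features.copy()
category_maps = {}  # column name -> categories, indexed by their integer code
for column in ["outlook"]:
    # factorize numbers the categories in order of first appearance.
    codes, uniques = pd.factorize(features[column])
    encoded[column] = codes
    category_maps[column] = list(uniques)

print(encoded)
print(category_maps)
```

With this encoding each original feature keeps a single column, so the exported tree shows `outlook` rather than `outlook_overcast`. The trade-off is that the codes impose an arbitrary order: a split such as `outlook <= 0.5` would separate "sunny" from the other two categories only because of how the codes happened to be assigned.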