
How do I use the lookup table from numpy's return_inverse on a different array?

  • Prateek Narendra  · 6 years ago

    I have a lookupTable created using -

    lookupTable, data_training_panda_y_indexed = np.unique(data_training_panda_y, return_inverse=True)
    

    I want to use lookupTable on another array, data_cross_validation_panda_y.

    data_training_panda_y is a list of strings that can take these values: Incoming, Outgoing, Neutral.

    So lookupTable is an ndarray ('Incoming' 'Outgoing' 'Neutral').

    Code so far -

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    
    #Load data
    data = np.genfromtxt('../Data/bezdekIris.csv',delimiter=',',usecols=[0,1,2,3,4],dtype=None)
    labels = np.genfromtxt('../Data/bezdekIris.csv',delimiter=',',usecols=[4],dtype=None)
    #Shuffle the rows
    np.random.shuffle(data)
    
    #Cut the data into 3 parts
    data_rows = np.size(data, 0)
    training_rows = int(round(0.6*data_rows))
    cross_validation_rows = int(round(0.2*data_rows))
    testing_rows = data_rows - training_rows - cross_validation_rows
    
    data_training_panda = pd.DataFrame(data[:training_rows])
    data_training_panda_X = data_training_panda.iloc[:,0:4]
    data_training_panda_y = data_training_panda.iloc[:,4]
    
    data_cross_validation_panda = pd.DataFrame(data[training_rows:training_rows+cross_validation_rows])
    data_cross_validation_panda_X = data_cross_validation_panda.iloc[:,0:4]
    data_cross_validation_panda_y = data_cross_validation_panda.iloc[:,4]
    
    data_testing_panda = pd.DataFrame(data[training_rows+cross_validation_rows:])
    data_testing_panda_X = data_testing_panda.iloc[:,0:4]
    data_testing_panda_y = data_testing_panda.iloc[:,4]
    
    #Take out the labels from the 3 parts
    lookupTable, data_training_panda_y_indexed = np.unique(data_training_panda_y, return_inverse=True)
    
    #Label the CV and Testing 
    data_cross_validation_panda_y_indexed = np.array([])
    data_testing_panda_y_indexed = np.array([])
    

    Sample data from bezdekIris.csv -

    5.1,3.5,1.4,0.2,Incoming
    4.9,3.0,1.4,0.2,Outgoing
    4.7,3.2,1.3,0.2,Neutral
    
    1 Answer
  • klim  · 6 years ago

    Using searchsorted might be a solution. np.unique returns lookupTable in sorted order, which is what searchsorted requires.

    data_cross_validation_panda_y_indexed = np.searchsorted(lookupTable, data_cross_validation_panda_y)
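
    As a minimal sketch of how this could look end to end: the small arrays below are made-up stand-ins for the question's columns, not the real data. One caveat worth checking is that searchsorted only returns an insertion position, so a label that never appeared in the training data would silently get a wrong (or out-of-range) index. The membership check at the end is an assumed safeguard, not part of the original suggestion.

```python
import numpy as np

# Made-up stand-ins for the label columns in the question.
data_training_panda_y = np.array(["Incoming", "Outgoing", "Neutral", "Incoming"])
data_cross_validation_panda_y = np.array(["Neutral", "Incoming", "Outgoing", "Neutral"])

# np.unique returns the distinct labels in sorted order, plus the index of
# each training label within that sorted array.
lookupTable, data_training_panda_y_indexed = np.unique(
    data_training_panda_y, return_inverse=True
)

# Because lookupTable is sorted, searchsorted finds the position of each
# cross-validation label in it.
data_cross_validation_panda_y_indexed = np.searchsorted(
    lookupTable, data_cross_validation_panda_y
)

# Safeguard (an addition, not from the original answer): verify that every
# label really exists in lookupTable. Clip first so an out-of-range
# insertion position cannot raise an IndexError during the comparison.
clipped = np.clip(data_cross_validation_panda_y_indexed, 0, len(lookupTable) - 1)
known = lookupTable[clipped] == data_cross_validation_panda_y
if not known.all():
    raise ValueError(
        "Labels not seen in training: %s"
        % np.unique(data_cross_validation_panda_y[~known])
    )

print(lookupTable)
print(data_cross_validation_panda_y_indexed)
```

    The same call should work for data_testing_panda_y. Indexing back with lookupTable[data_cross_validation_panda_y_indexed] recovers the original strings.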