
linear models features and export_text_m5 #19

@camsilva

Hi,

I have a question about the linear models. My dataset has 16 features when I fit the M5Prime model:

modelM5 = M5Prime(max_depth=4)
modelM5.fit(x_train.values, y_train.values)

After fitting, this is the final part of the output of export_text_m5:

LM1: 2.165e-02 * X[0] + 2.675e-05 * X[1] - 9.072e-06 * X[2] - 1.355e-04 * X[3] + 7.039e-05 * X[4] + 3.113e-02 * X[5] + 2.730e-01
LM2: 1.985e-02 * X[0] + 2.820e-05 * X[1] - 9.072e-06 * X[2] - 1.355e-04 * X[3] + 7.039e-05 * X[4] + 3.113e-02 * X[5] + 2.764e-01
LM3: 1.985e-02 * X[0] + 2.931e-05 * X[1] - 9.072e-06 * X[2] - 1.355e-04 * X[3] + 7.039e-05 * X[4] + 3.113e-02 * X[5] + 2.776e-01
LM4: 3.175e-02 * X[0] + 1.754e-05 * X[1] - 9.072e-06 * X[2] - 3.401e-04 * X[3] + 7.039e-05 * X[4] + 6.773e-02 * X[5] + 4.613e-01
LM5: 3.175e-02 * X[0] + 1.754e-05 * X[1] - 9.072e-06 * X[2] - 3.265e-04 * X[3] + 7.039e-05 * X[4] + 6.773e-02 * X[5] + 4.393e-01
LM6: 2.128e-02 * X[0] + 1.754e-05 * X[1] - 9.072e-06 * X[2] - 1.631e-04 * X[3] + 7.039e-05 * X[4] + 2.011e-01 * X[5] + 6.056e-01
LM7: 3.445e-02 * X[0] + 6.073e-05 * X[1] - 6.333e-04 * X[2] - 5.438e-05 * X[3] + 2.951e-04 * X[4] + 8.026e-02 * X[5] + 4.717e-01
LM8: 3.445e-02 * X[0] + 6.073e-05 * X[1] - 2.718e-04 * X[2] - 5.438e-05 * X[3] + 6.259e-05 * X[4] + 3.223e-02 * X[5] + 3.194e-01
LM9: 3.445e-02 * X[0] + 6.073e-05 * X[1] - 2.718e-04 * X[2] - 5.438e-05 * X[3] + 6.979e-05 * X[4] + 3.223e-02 * X[5] + 3.191e-01
LM10: 1.050e-02 * X[0] + 1.643e-04 * X[1] - 1.301e-05 * X[2] - 5.438e-05 * X[3] + 1.801e-05 * X[4] + 6.602e-02 * X[5] + 2.598e-01
LM11: 1.050e-02 * X[0] + 2.685e-05 * X[1] - 1.301e-05 * X[2] - 5.438e-05 * X[3] + 1.801e-05 * X[4] + 2.772e-02 * X[5] + 4.189e-01
LM12: 1.050e-02 * X[0] + 2.685e-05 * X[1] - 1.301e-05 * X[2] - 5.438e-05 * X[3] + 1.801e-05 * X[4] + 1.761e-02 * X[5] + 4.762e-01

However, when I look at the features attribute of each node model:

[x.features for x in modelM5.node_models]

I get something like:

[[0, 4, 8, 10, 12, 15],
[0, 4, 8, 10, 12, 15],
[0, 4, 8, 10, 12, 15],
(...)
]

Doesn't this mean that the variables used by the linear models are different from the ones printed by export_text_m5? Does X[0] really refer to the first feature of my inputs, or to the first of the features that the linear model actually used?
Right now the predictions work fine, but they don't match what I get when I apply the linear model coefficients by hand (a rough sketch of what I am doing is below).
Is there some kind of issue here, or is something wrong on my side?
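
To make the mismatch concrete, this is roughly how I am checking it by hand. Note that picking x_train row 0 as a point that ends up in the LM1 leaf, and the two indexing interpretations, are just assumptions on my part:

import numpy as np

# coefficients printed for LM1 above, copied from the export_text_m5 output
lm1_coefs = np.array([2.165e-02, 2.675e-05, -9.072e-06, -1.355e-04, 7.039e-05, 3.113e-02])
lm1_intercept = 2.730e-01

sample = x_train.values[0]                # assuming this row falls into the LM1 leaf
feats = modelM5.node_models[0].features   # e.g. [0, 4, 8, 10, 12, 15]

# interpretation A: X[i] is the i-th column of the original 16-feature input
pred_a = lm1_coefs @ sample[:6] + lm1_intercept

# interpretation B: X[i] is the i-th entry of the node model's features list
pred_b = lm1_coefs @ sample[feats] + lm1_intercept

# compare both against the tree's own prediction
print(pred_a, pred_b, modelM5.predict(sample.reshape(1, -1)))

Depending on which interpretation is correct, either pred_a or pred_b should match the model's prediction for that sample.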

Thank you in advance.
