RewHier

RewHier is a package for taxonomy modification using the flattening and rewiring approaches proposed in the paper.

All code has been tested in a Windows environment. The same code should also work in a Unix environment, provided you have a compiled version of the liblinear package for Unix (in addition, you need to copy the ovrtrain and ovrpredict files from the matlab folder of the liblinear package).

Source code: Code
Dataset: Dataset

Dataset Description

train, validation, test - All datasets used for the experiments are in libsvm format.

Input file format

<labelid> <index1:value1> <index2:value2> ...

Example

1 1:0.32 3:0.72 8:1.00 12:0.42
2 1:0.53 5:0.73 9:0.20
...
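
As an illustration of the input format above, a line can be parsed as follows. This is a minimal sketch in Python (the package itself is MATLAB and reads these files through liblinear); the function name parse_libsvm_line is hypothetical and not part of RewHier.

```python
def parse_libsvm_line(line):
    """Parse one '<labelid> <index:value> ...' line into (label, features)."""
    tokens = line.split()
    label = int(tokens[0])            # first token is the label id
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)   # sparse: only listed indices are stored
    return label, features


label, features = parse_libsvm_line("1 1:0.32 3:0.72 8:1.00 12:0.42")
print(label, features)
```

Indices that do not appear on a line are treated as zero, which is why a dictionary is used here instead of a dense list.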

hierarchy - All hierarchies are represented in edge-list format.

Hierarchy file format

<parentid> <childid>
<parentid> <childid>
...

Example

0 15
0 25
15 22
22 28
...
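
The edge list above can be loaded into parent and child lookups, sketched here in Python for illustration only. The function name parse_hierarchy is hypothetical, and the sketch assumes the hierarchy is a tree (each child has exactly one parent); for a DAG the parent map would keep only the last parent seen.

```python
from collections import defaultdict


def parse_hierarchy(lines):
    """Build (children, parent) maps from '<parentid> <childid>' lines."""
    children = defaultdict(list)
    parent = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue                  # skip blank or malformed lines
        p, c = int(parts[0]), int(parts[1])
        children[p].append(c)
        parent[c] = p
    return dict(children), parent


children, parent = parse_hierarchy(["0 15", "0 25", "15 22", "22 28"])
print(children)
print(parent)
```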

Readme Description

Model Training and Prediction - Code to train models and make predictions on test instances.
flat: This folder contains code to learn one-vs-rest LR models, ignoring the hierarchy.
- flat_model_learning.m: learn model parameters.
- flat_test_prediction.m: perform prediction on test datasets.

td_one_vs_all: This folder contains code to learn one-vs-rest LR models in a top-down manner.
- td_one_vs_all_model_learning.m: learn model parameters.
- td_one_vs_all_test_prediction.m: perform prediction on test datasets.

td_one_vs_all_level_flatten: This folder contains code to learn one-vs-rest LR models in a top-down manner on the level-flattened hierarchy.
- td_one_vs_all_level_flatten_model_learning.m: learn model parameters.
- td_one_vs_all_level_flatten_test_prediction.m: perform prediction on test datasets.

td_one_vs_all_flattening_approach: This folder contains code to learn one-vs-rest LR models in a top-down manner on the hierarchy produced by the proposed flattening approach.
- td_one_vs_all_flattening_approach_model_learning.m: learn model parameters.
- td_one_vs_all_flattening_approach_test_prediction.m: perform prediction on test datasets.

td_one_vs_all_rewiring_approach: This folder contains code to learn one-vs-rest LR models in a top-down manner on the hierarchy produced by the proposed rewiring approach.
- td_one_vs_all_rewiring_approach_model_learning.m: learn model parameters.
- td_one_vs_all_rewiring_approach_test_prediction.m: perform prediction on test datasets.
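
To show what top-down prediction means in the folders above, here is a minimal Python sketch (not the package's MATLAB code). It starts at the root and repeatedly moves to the highest-scoring child until it reaches a leaf. The names predict_top_down and score are hypothetical; score stands in for the trained one-vs-rest models, and root id 0 is an assumption taken from the hierarchy example.

```python
def predict_top_down(children, score, x, root=0):
    """Walk from the root to a leaf, choosing the best-scoring child at each step."""
    node = root
    while children.get(node):         # stop when the node has no children (a leaf)
        node = max(children[node], key=lambda child: score(child, x))
    return node


# Toy hierarchy and toy scores; a real score would come from a trained model.
children = {0: [15, 25], 15: [22], 22: [28]}
toy_scores = {15: 0.9, 25: 0.1, 22: 0.5, 28: 0.5}
predicted = predict_top_down(children, lambda node, x: toy_scores[node], x=None)
print(predicted)
```

A wrong choice at an upper level cannot be corrected further down, which is the motivation for modifying the hierarchy before training.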

Preprocessing - Code to create a new hierarchy after flattening and rewiring.
TD-LR method:
- level_file_creation_td.m: creates the level file from the cat_hier file. The level file is useful for determining the positive and negative examples of each node.
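
The exact layout of the level file is not described here, so the following Python sketch only illustrates the underlying idea: computing the level (depth) of every node from the edge list with a breadth-first traversal. The function name node_levels is hypothetical, and root id 0 is an assumption based on the hierarchy example.

```python
from collections import defaultdict, deque


def node_levels(edges, root=0):
    """Return {node: level}, where the root is at level 0."""
    children = defaultdict(list)
    for p, c in edges:
        children[p].append(c)
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in levels:   # guard against revisiting a node
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


levels = node_levels([(0, 15), (0, 25), (15, 22), (22, 28)])
print(levels)
```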

Level flattening method:
- newHierarchy_top_bottom_flattening.m: creates a new hierarchy file after flattening the top or bottom level.
- newHierarchy_multilevel13_flattening.m: creates a new hierarchy file after flattening the 1st and 3rd levels. (This code depends on newHierarchy_top_bottom_flattening.m.)
- level_file_creation_top_bottom_flattening.m: creates the level file from the cat_hier_tlf (top level flattened) or cat_hier_blf (bottom level flattened) file.
- level_file_creation_multilevel13_flattening.m: creates the level file from the cat_hier_mlf (multiple level flattened) file.
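
Level flattening can be pictured with the Python sketch below: the internal nodes at one level are removed and their children are attached to the removed node's parent. This is an illustration under assumptions, not the package's MATLAB implementation. The function name flatten_level is hypothetical, the hierarchy is assumed to be a tree rooted at id 0, and keeping leaf nodes at the flattened level (so that no class is lost) is an assumption about the intended behavior.

```python
def flatten_level(edges, level, root=0):
    """Remove internal nodes at `level` and reattach their children one level up."""
    parent = {c: p for p, c in edges}
    has_children = {p for p, _ in edges}

    def depth(node):
        d = 0
        while node != root:           # climb to the root, counting edges
            node = parent[node]
            d += 1
        return d

    removed = {n for n in parent if depth(n) == level and n in has_children}
    new_edges = []
    for p, c in edges:
        if c in removed:
            continue                  # drop the edge leading into a removed node
        if p in removed:
            p = parent[p]             # reattach the child to its grandparent
        new_edges.append((p, c))
    return new_edges


flattened = flatten_level([(0, 15), (0, 25), (15, 22), (22, 28)], level=1)
print(flattened)
```

In this toy example node 15 is removed and node 22 becomes a child of the root, while the leaf 25 stays in place.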

Proposed method:
- similarityMain.m: main function for computing the cosine similarity on the dataset.
- simComputation.m: helper function for the cosine similarity computation.
- similarPairGrouping: groups the most similar pairs.
- fscore_computation_proposed_approach.m: computes the f*_n score for each node n.
- nodeLabels.m: creates a one-to-one mapping between each node and its f*_n value.
- inconsistentNodeSelection.m: selects inconsistent nodes for our proposed approaches, Level-INR and Global-INR.
- newHierarchy_proposed_method.m: creates a new hierarchy file after removing inconsistent nodes level-wise or globally.
- level_file_creation_proposed_approach.m: creates the level file for the proposed approach, Level-INR or Global-INR.
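
For reference, the cosine similarity used in the similarity step above can be sketched in Python as follows. This is only an illustration of the standard formula on two sparse vectors stored as {index: value} dictionaries; the function name cosine_similarity is hypothetical, and which vectors the package actually compares (for example, per-class representations) is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as {index: value} dicts."""
    dot = sum(value * b[index] for index, value in a.items() if index in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0                    # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity({1: 1.0}, {1: 2.0})
orthogonal = cosine_similarity({1: 1.0}, {2: 1.0})
print(same_direction, orthogonal)
```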