Try to outperform random forest AL with deep AL (TensorFlow 2.0)
Deep Activity Learning (DAL)

Goals:

  • Compare random forest activity learning (AL) with deep activity learning
  • Compare previous AL features with simpler feature vector
  • Add domain adaptation and generalization to deep network for further improvement

Steps:

  • Preprocess the data: extract the desired features, create time-series windows, and generate cross-validation train/test splits (see generate_datasets.sh, preprocessing/, etc.)
  • Run and compare AL (al.py) and DAL (dal.py) on the datasets

Datasets

This code is designed to work with smart home datasets in the formats used on the CASAS website. To download some smart home data to preprocessing/orig, convert it into the appropriate annotated format, and output the result to preprocessing/raw, run:

./download_datasets.sh

Preprocessing

To apply activity label and sensor translations, generate the desired feature representations and time-series windows, and create the .tfrecord files:

./generate_datasets.sh
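Each preprocessed window ends up serialized into a .tfrecord file. As a rough illustration (not the repo's actual schema, which is defined in generate_tfrecord.py), a window of float features plus an integer activity label could be written like this:

```python
# Hedged sketch: serialize one feature window + label to a TFRecord file.
# The feature names "x"/"y" and the file path are illustrative only.
import tensorflow as tf

def make_example(features, label):
    # Pack floats and an int64 label into a tf.train.Example proto
    return tf.train.Example(features=tf.train.Features(feature={
        "x": tf.train.Feature(float_list=tf.train.FloatList(value=features)),
        "y": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

with tf.io.TFRecordWriter("/tmp/example.tfrecord") as writer:
    writer.write(make_example([0.1, 0.2, 0.3], 2).SerializeToString())
```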

Note: many of the Bash scripts use my multi-threading/processing script /scripts/threading to speed up preprocessing significantly, so you'll want to either remove those calls or download the script.
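If you'd rather not depend on that script, a minimal stand-in using Python's multiprocessing module looks like this (process_home and the dataset names are hypothetical placeholders for one preprocessing task per home):

```python
# Hypothetical alternative to the /scripts/threading helper: run one
# preprocessing task per smart-home dataset across a process pool.
from multiprocessing import Pool

def process_home(name):
    # Placeholder for real per-home preprocessing (feature extraction,
    # windowing, etc.); here it just tags the dataset name as "done".
    return name + ": done"

if __name__ == "__main__":
    homes = ["hh101", "hh102", "hh103"]  # hypothetical dataset names
    with Pool() as pool:
        print(pool.map(process_home, homes))
```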

Running

AL

To train AL (which uses random forests) and compute the results:

./al_results.sh
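For reference, the random-forest baseline boils down to fitting a classifier on per-window feature vectors and cross-validating it. A minimal sketch with scikit-learn and synthetic data (the real pipeline lives in al.py, and the shapes here are made up):

```python
# Hedged sketch of the random-forest AL baseline on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))    # one feature vector per time window
y = rng.integers(0, 4, size=200)  # activity labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=3)  # cross-validated accuracy
print(scores.mean())
```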

DAL

If running locally on your computer, run the cross-validation training:

./dal_cv.sh
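At its core, DAL trains a TensorFlow 2 network with a @tf.function-compiled training step. A self-contained sketch with a placeholder model and random data (the actual architecture and training loop live in dal.py and models.py):

```python
# Hedged sketch of a TF 2 train step; model, sizes, and data are placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(4),  # logits for 4 hypothetical activity classes
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function  # compile the step into a graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x = tf.random.normal((8, 16))
y = tf.random.uniform((8,), maxval=4, dtype=tf.int32)
loss = train_step(x, y)
```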

If running on a cluster (after editing kamiak_config.sh):

./kamiak_upload.sh
./kamiak_queue_all.sh flat --dataset=al.zip --features=al --model=flat
./kamiak_queue_all.sh flat-da --dataset=al.zip --features=al --model=flat --adapt

# on your computer
./kamiak_tflogs.sh # during training, to download the logs/models

Then pick the best models based on the validation results above (or, when using domain adaptation, the last model) and evaluate them on the full train and test sets for comparison with AL:

./dal_results.sh flat --features=al # set "from" to either kamiak or cv
./dal_results.sh flat-da --features=al --last
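Model selection here relies on saved checkpoints; the repo's checkpoints.py wraps a custom CheckpointManager that tracks both the latest and the best model. A minimal sketch of the underlying TensorFlow mechanism, with a made-up /tmp path:

```python
# Hedged sketch of checkpoint save/restore with tf.train.CheckpointManager.
import tensorflow as tf

step = tf.Variable(0)
ckpt = tf.train.Checkpoint(step=step)
manager = tf.train.CheckpointManager(ckpt, "/tmp/dal_ckpt_demo", max_to_keep=3)

for _ in range(3):
    step.assign_add(1)
    manager.save()  # keeps only the 3 most recent checkpoints

step.assign(0)                           # pretend we restarted training
ckpt.restore(manager.latest_checkpoint)  # restores step back to 3
```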