Substra / substra

Low-level Python library used to interact with a Substra network

Home Page: https://docs.substra.org

Hard drive full and crash of kubernetes

chrisalexandrepena opened this issue

I've deployed the default Substra cluster on a Minikube cluster hosted on an Ubuntu Server machine.
I used the tags listed in the compatibility table for Substra 8.0 and it all went fine. Then, after running a few model trainings (no more than 2), Kubernetes stopped working because the hard drive was saturated, which had never happened before (40 GB virtual hard drive).
When examining the file system, it seems 2 main directories have filled up:

  • /tmp/hostpath-provisioner (13 GB)
  • /var/lib/docker/overlay2 (44 GB), which contains 557 folders; most are small, but many are pretty big:
300M	/var/lib/docker/overlay2/fa3ea6fc1178306c9767d69b431f8f1c09568d5cdfec520dd52028865ec8182d
330M	/var/lib/docker/overlay2/9dd078099719384b6f6cd6f1c729579021c6c35f90b8c459517fa88c8a176035
331M	/var/lib/docker/overlay2/9b5d5cf40927c02f779ca02801ef22b6bcc748d0767264bce4257a23f88214fb
356K	/var/lib/docker/overlay2/1ec0d6affdade188c73ab4c462a033d3da6968ddaf6714d66ab64198fa6408c8
375M	/var/lib/docker/overlay2/7cb67a313fa7473c7a6226956e89e80682c1e91810e06367806ae8de8722d51f
378M	/var/lib/docker/overlay2/5f3d2333d7da6eb7b09ad4d06cb986f51d0c5b7f539ad42cbe7caba3c9613841
382M	/var/lib/docker/overlay2/3a521bae19f5b1ad960464430ed71a0af19c9c5548a3d14fa3b438a373f4bede
391M	/var/lib/docker/overlay2/4dd31912dd19afc0ffd58afbde2df2263d0548709720a0ee90ac48f58ddf006e
391M	/var/lib/docker/overlay2/d440b002aff91e8fd25a7d068b1da6bdd352928f444a7aeff79760bca86f585b
392M	/var/lib/docker/overlay2/2f3511ffa5d41862cb621003a22416cc64bb60849859ba274fcfa81eb5925f1b
392M	/var/lib/docker/overlay2/48ba9814979f8b1da9e33b3500e64bc7f2e43866cc86682cbd6d615362085741
392M	/var/lib/docker/overlay2/4c860913012f299a3cfa4a536e4b7b05d6cc2b764222e55b250bf40ac6a4b493
392M	/var/lib/docker/overlay2/618749148b1792f1608be42ee040537b033843fe1bd37ac65f52440466a5e111
392M	/var/lib/docker/overlay2/74ae3c66c9af1ead2b0f864c44c5a4eca8ac7b2689c7c500ca6b2dfa70701a94
392M	/var/lib/docker/overlay2/aed8bce3002ab74f391b9ba25bf3fd457f6bb083e5372d1ef1ce0b95a4701ea1
392M	/var/lib/docker/overlay2/b09b1ed5475758836e57c1fc29b43d61f0e64839d56ab8f32664c92f69e9584c
392M	/var/lib/docker/overlay2/ce94a00dc9427dea998815bdf4ac575b7fbb7e567a2209c86f4160ec07c7340a
392M	/var/lib/docker/overlay2/fe97603748f0407f74c4018993271fb2ddc099784310cf1240b6f4e641659beb
393M	/var/lib/docker/overlay2/55762435fd36158ee24ed048c11d168b88b8ce5bd79f757fb582db4f2f99b945
402M	/var/lib/docker/overlay2/1fb07f4c9b24ba4eae5f51cb10fcd122d397b71819a38ba5843d9520a019bfc6
403M	/var/lib/docker/overlay2/c9aeb65f4ad33e527457f4459f13ab8c496b464c857907017ad71fec58ed1e9f
431M	/var/lib/docker/overlay2/64224c57d7e6f0488cc5f831f250fc9b7503c14aa6a95e76ce780168e1434149
615M	/var/lib/docker/overlay2/15484ab2e89d705ca71569463d83b427334cb885637ba127a68a476fdd5677d3
615M	/var/lib/docker/overlay2/1846d8518ad43d475e08e0f29418de100433e31b364d42bcd841fbf2daa7a4d7
615M	/var/lib/docker/overlay2/4a56f63375091b9df5935c040eb5b0bd2519c2e86dd081f973f94304713d6d0c
615M	/var/lib/docker/overlay2/69fc59fdf53b19d5d6667dd01106ebd02ff3b3029bb41dbabc4dc8c8c12f164a
615M	/var/lib/docker/overlay2/7dd5bce8d4a9fa263dd3892f14daefc83585c6db261122a6b253ef096fe5ab98
616M	/var/lib/docker/overlay2/0c3a5d542cbf9d550d5bcbadadf42bf7668f13e8ba8c21a78740723ac3d1a729
616M	/var/lib/docker/overlay2/3e95a167f911640a7ed03520221306c6af6c94b49fb97047a194eac422893df8
616M	/var/lib/docker/overlay2/49543f4224ec89b39cde3197ca1d225366538bdf392fe017db7c7a78ce45b020
616M	/var/lib/docker/overlay2/4b4b86062dfbdafe02ecdf4be8d75d3f141f77784075b3862579fa83350418b8
616M	/var/lib/docker/overlay2/50f8e92762bfed36347946c0a404f9657dccb59e6982860e1a0bc7c5c033eb3b
616M	/var/lib/docker/overlay2/5ea69a2301d23026bad3cbd63ac8ba22d274d4d344bcc1145bfa618609b8522c
616M	/var/lib/docker/overlay2/7c1792eedbc7582227a7c9c3489a35ad7299931e3732fa95f5866b16864cb1a7
616M	/var/lib/docker/overlay2/955ce888f8dbced2e9eb26c1744abb1b5d7471fc56a3ac7aa442dae1604f5f24
616M	/var/lib/docker/overlay2/a560f444dfc2a48eef54d97467a6c2bb69d9ee4ade2bc7adc8818926128c7e7a
616M	/var/lib/docker/overlay2/a8cb9583b9470c078c5308f47fc1279ea9ca94ad453c8fbdc42c6542a287270a
616M	/var/lib/docker/overlay2/b93b89be8107f4c26369124233b198042c72ef6de0a0ab17c786f1d9a73a6e06
617M	/var/lib/docker/overlay2/1ca30dfa825b6ffd2994620c30414c514a2d80ce97f21eb80e7fe90270aa95a5
617M	/var/lib/docker/overlay2/5207ad7398d32119b97c4a2e3198f603620d5e83d5bf5a60f239b1a1280cf47a
617M	/var/lib/docker/overlay2/9f730f4a6bb3b7cb57a8971ecb26db0f5ac68e51cf388732ef6d16eb391530af
617M	/var/lib/docker/overlay2/ca44ec1ccd837c762eca9f59b1f542aad0e8a5b1da0c4e2d51f59894da92c0e3
1.4G	/var/lib/docker/overlay2/1d2c3331e3ed53f68c8affaa269877c4cc945c225be6be8a22c14e31bb0e8515
1.4G	/var/lib/docker/overlay2/2fd9e0dfd21d841ea946a96fb6b879dcaf38012492dc1976717579dfd9aa6efd
1.4G	/var/lib/docker/overlay2/5ad3ad426f50847fcd6d425d03e213e546a67522d130b701d4ddcfd4ef9e6e36
1.4G	/var/lib/docker/overlay2/a46c393271f17ae50e1d83bc2dd2afc47abc4f261fee172b9de399e198d65c09
2.3G	/var/lib/docker/overlay2/10b0c1cba511ee95222a360c9fa18025e8e26e161e716197779e28cb36374823
2.3G	/var/lib/docker/overlay2/f7c8d8ee76bc6783431bfdb0a35d6696e650c3adf30e044e3b13a2a87036481a
3.7G	/var/lib/docker/overlay2/a5ac9d3a91aa391b508191a1dc647f9875cf883ab2ee98521e9c599b10eb7bb1

The bulk of the data in the biggest folder is structured like this:

1.4G	/var/lib/docker/overlay2/a5ac9d3a91aa391b508191a1dc647f9875cf883ab2ee98521e9c599b10eb7bb1/diff/kaniko
1.3G	/var/lib/docker/overlay2/a5ac9d3a91aa391b508191a1dc647f9875cf883ab2ee98521e9c599b10eb7bb1/diff/usr/local 
# usr/local is where the Python packages are installed
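
The listing above mixes K, M and G suffixes, so it is easy to misread the order. As a rough aid, here is a minimal Python sketch that converts `du -h` style lines to bytes and ranks them largest first. It assumes the sizes use powers of 1024 and that each line is a size followed by a path. The function names `parse_size` and `rank_du_output` are made up for illustration and are not part of Substra or Docker.

```python
import re

# Multipliers for the suffixes printed by `du -h` (assumed to be powers of 1024).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a du -h size such as '356K' or '1.4G' to a number of bytes."""
    match = re.fullmatch(r"([0-9.]+)([KMGT]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])


def rank_du_output(du_text):
    """Return (bytes, path) pairs parsed from du -h output, largest first."""
    rows = []
    for line in du_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        size, path = line.split(None, 1)
        rows.append((parse_size(size), path.strip()))
    return sorted(rows, reverse=True)


# Three lines copied from the listing in this issue.
sample = (
    "356K\t/var/lib/docker/overlay2/1ec0d6affdade188c73ab4c462a033d3da6968ddaf6714d66ab64198fa6408c8\n"
    "616M\t/var/lib/docker/overlay2/0c3a5d542cbf9d550d5bcbadadf42bf7668f13e8ba8c21a78740723ac3d1a729\n"
    "3.7G\t/var/lib/docker/overlay2/a5ac9d3a91aa391b508191a1dc647f9875cf883ab2ee98521e9c599b10eb7bb1\n"
)

for size_bytes, path in rank_du_output(sample):
    print(f"{size_bytes:>12d}  {path}")
```

Feeding it the full output of `du -h` on /var/lib/docker/overlay2 should give the same ranking for all 557 folders. The sizes are rounded by `du -h`, so the byte counts are only approximate.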

The datasets we used were:

  • titanic (66kB csv)
  • mnist (63MB csv)
  • ham (104MB images and csv)

So nothing too enormous.
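
To notice the disk filling up before Kubernetes stops, a small free-space check could be run on the host between trainings. This is only a sketch using the Python standard library. The 0.9 threshold and the helper names `used_fraction` and `is_nearly_full` are arbitrary choices for illustration, and the paths are the two directories mentioned above.

```python
import shutil


def used_fraction(path="/"):
    """Return the used share (0.0 to 1.0) of the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def is_nearly_full(path="/", threshold=0.9):
    """True when the file system holding `path` is at or above `threshold`."""
    return used_fraction(path) >= threshold


if __name__ == "__main__":
    # The two directories that filled up in this report.
    for directory in ("/tmp/hostpath-provisioner", "/var/lib/docker/overlay2"):
        try:
            print(directory, f"{used_fraction(directory):.0%} used")
        except OSError:
            # The directory may be missing or unreadable on other machines.
            print(directory, "not accessible")
```

Note that `shutil.disk_usage` reports on the whole file system containing the path, not on the directory itself, so both directories print the same figure if they share a disk.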