julie@julie-HP-Pavilion-TS-14-Notebook-PC:~/Bureau/Big Data Frameworks$ sudo apt-get install filezilla
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  gir1.2-geocodeglib-1.0 libfwup1 libllvm6.0 libllvm6.0:i386
  linux-headers-4.15.0-46 linux-headers-4.15.0-46-generic
  linux-image-4.15.0-46-generic linux-modules-4.15.0-46-generic
  linux-modules-extra-4.15.0-46-generic ubuntu-web-launchers
Use "sudo apt autoremove" to remove them.
The following additional packages will be installed:
  filezilla-common libfilezilla0 libpugixml1v5 libwxgtk3.0-0v5
The following NEW packages will be installed:
  filezilla filezilla-common libfilezilla0 libpugixml1v5 libwxgtk3.0-0v5
0 upgraded, 5 newly installed, 0 to remove and 40 not upgraded.
Need to get 4,182 kB/8,405 kB of archives.
After this operation, 35.0 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://fr.archive.ubuntu.com/ubuntu bionic/universe amd64 libwxgtk3.0-0v5 amd64 3.0.4+dfsg-3 [4,182 kB]
Fetched 2,837 kB in 3min 27s (13.7 kB/s)
Selecting previously unselected package filezilla-common.
(Reading database ... 315373 files and directories currently installed.)
Preparing to unpack .../filezilla-common_3.28.0-1_all.deb ...
Unpacking filezilla-common (3.28.0-1) ...
Selecting previously unselected package libfilezilla0.
Preparing to unpack .../libfilezilla0_0.11.0-1_amd64.deb ...
Unpacking libfilezilla0 (0.11.0-1) ...
Selecting previously unselected package libpugixml1v5:amd64.
Preparing to unpack .../libpugixml1v5_1.8.1-7_amd64.deb ...
Unpacking libpugixml1v5:amd64 (1.8.1-7) ...
Selecting previously unselected package libwxgtk3.0-0v5:amd64.
Preparing to unpack .../libwxgtk3.0-0v5_3.0.4+dfsg-3_amd64.deb ...
Unpacking libwxgtk3.0-0v5:amd64 (3.0.4+dfsg-3) ...
Selecting previously unselected package filezilla.
Preparing to unpack .../filezilla_3.28.0-1_amd64.deb ...
Unpacking filezilla (3.28.0-1) ...
Setting up libpugixml1v5:amd64 (1.8.1-7) ...
Setting up libwxgtk3.0-0v5:amd64 (3.0.4+dfsg-3) ...
Setting up filezilla-common (3.28.0-1) ...
Setting up libfilezilla0 (0.11.0-1) ...
Setting up filezilla (3.28.0-1) ...
Processing triggers for desktop-file-utils (0.23-1ubuntu3.18.04.2) ...
Processing triggers for bamfdaemon (0.5.3+18.04.20180207.2-0ubuntu1) ...
Rebuilding /usr/share/applications/bamf-2.index...
Processing triggers for libc-bin (2.27-3ubuntu1.4) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for gnome-menus (3.13.3-11ubuntu1.1) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
Processing triggers for mime-support (3.60ubuntu1) ...
Exception in thread "main" java.io.IOException: Error opening job jar: /home/julie.ngan/YARN_JAVA_MR/target/hadoop-examples-mapreduce-1.0-SNAPSHOT-jar-with-dependencies.jar
at org.apache.hadoop.util.RunJar.run(RunJar.java:261)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:228)
at java.util.zip.ZipFile.<init>(ZipFile.java:157)
at java.util.jar.JarFile.<init>(JarFile.java:169)
at java.util.jar.JarFile.<init>(JarFile.java:106)
at org.apache.hadoop.util.RunJar.run(RunJar.java:259)
... 1 more
[julie.ngan@hadoop-edge01 target]$ alias launch_job="yarn jar ~/YARN_JAVA_MR/target/hadoop-examples-mapreduce-1.0-SNAPSHOT-jar-with-dependencies.jar"
[julie.ngan@hadoop-edge01 target]$ launch_job wordcount trees.csv count
21/11/04 11:05:25 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/04 11:05:25 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/04 11:05:25 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636020325738, maxDate=1636625125738, sequenceNumber=6659, masterKeyId=78 on ha-hdfs:efrei
21/11/04 11:05:25 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636020325738, maxDate=1636625125738, sequenceNumber=6659, masterKeyId=78)
21/11/04 11:05:25 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/04 11:05:25 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_4551
21/11/04 11:05:27 INFO input.FileInputFormat: Total input files to process : 1
21/11/04 11:05:27 INFO mapreduce.JobSubmitter: number of splits:1
21/11/04 11:05:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_4551
21/11/04 11:05:27 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636020325738, maxDate=1636625125738, sequenceNumber=6659, masterKeyId=78)]
21/11/04 11:05:27 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/04 11:05:27 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/04 11:05:28 INFO impl.YarnClientImpl: Submitted application application_1630864376208_4551
21/11/04 11:05:28 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_4551/
21/11/04 11:05:28 INFO mapreduce.Job: Running job: job_1630864376208_4551
21/11/04 11:05:38 INFO mapreduce.Job: Job job_1630864376208_4551 running in uber mode :false
21/11/04 11:05:38 INFO mapreduce.Job: map 0% reduce 0%
21/11/04 11:05:46 INFO mapreduce.Job: map 100% reduce 0%
21/11/04 11:05:52 INFO mapreduce.Job: map 100% reduce 100%
21/11/04 11:05:52 INFO mapreduce.Job: Job job_1630864376208_4551 completed successfully
21/11/04 11:05:52 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=16561
FILE: Number of bytes written=559129
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=14251
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=20370
Total time spent by all reduces in occupied slots (ms)=10536
Total time spent by all map tasks (ms)=6790
Total time spent by all reduce tasks (ms)=2634
Total vcore-milliseconds taken by all map tasks=6790
Total vcore-milliseconds taken by all reduce tasks=2634
Total megabyte-milliseconds taken by all map tasks=10429440
Total megabyte-milliseconds taken by all reduce tasks=5394432
Map-Reduce Framework
Map input records=98
Map output records=1219
Map output bytes=21556
Map output materialized bytes=16561
Input split bytes=103
Combine input records=1219
Combine output records=579
Reduce input groups=579
Reduce shuffle bytes=16561
Reduce input records=579
Reduce output records=579
Spilled Records=1158
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=185
CPU time spent (ms)=2700
Physical memory (bytes) snapshot=1450426368
Virtual memory (bytes) snapshot=7281455104
Total committed heap usage (bytes)=1513095168
Peak Map Physical memory (bytes)=1157615616
Peak Map Virtual memory (bytes)=3401347072
Peak Reduce Physical memory (bytes)=292810752
Peak Reduce Virtual memory (bytes)=3880108032
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=14251
package com.opstty.mapper;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class TreesMapper extends Mapper<Object, Text, IntWritable, IntWritable> {
    // Line counter used to skip the first line of the split (the CSV header
    // when the input file produces a single split).
    private int curr_line = 0;

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        if (curr_line != 0) {
            // Emit (district, 1): the district is the second ';'-separated field.
            context.write(new IntWritable(Integer.parseInt(value.toString().split(";")[1])), new IntWritable(1));
        }
        curr_line++;
    }
}
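The mapper's output hinges on one expression, `value.toString().split(";")[1]`, which extracts the district (the second `;`-separated column) from each CSV row. That extraction can be sketched in plain Java, with a guard for malformed rows; the class name and the `NO_DISTRICT` sentinel below are illustrative, not part of the project:

```java
// Plain-Java sketch of the mapper's field extraction (no Hadoop required).
// The class name and the NO_DISTRICT sentinel are illustrative only.
public class DistrictField {
    public static final int NO_DISTRICT = -1;

    /** Returns the district id (second ';'-separated column),
     *  or NO_DISTRICT when the row is malformed or non-numeric. */
    public static int parse(String csvLine) {
        String[] fields = csvLine.split(";");
        if (fields.length < 2) return NO_DISTRICT;
        try {
            return Integer.parseInt(fields[1].trim());
        } catch (NumberFormatException e) {
            return NO_DISTRICT;
        }
    }
}
```

A guard like this would avoid the `NumberFormatException` the raw `split(";")[1]` throws if a header line or an empty row slips past the line counter.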
TreesReducer.java
package com.opstty.reducer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class TreesReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum the counts emitted for this district.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
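The reducer just sums the 1s grouped under each district key, and because MapReduce sorts keys before the reduce phase, the result comes back district-sorted. The shuffle-plus-reduce step can be sketched in plain Java with a `TreeMap`, which mirrors that sorted grouping (the class and method names here are illustrative):

```java
import java.util.TreeMap;

// Plain-Java sketch of the shuffle + reduce step: group the mapper's
// (district, 1) pairs by key and sum the values. TreeMap mirrors the
// key-sorted order MapReduce presents to the reducer.
public class DistrictCount {
    public static TreeMap<Integer, Integer> reduce(int[][] pairs) {
        TreeMap<Integer, Integer> sums = new TreeMap<>();
        for (int[] pair : pairs) {            // pair[0] = district, pair[1] = count
            sums.merge(pair[0], pair[1], Integer::sum);
        }
        return sums;
    }
}
```

This is also why using `TreesReducer` as the combiner is safe: summing is associative and commutative, so pre-aggregating on the map side (1219 combine input records down to 579 output records in the wordcount run above) does not change the final totals.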
DistinctDistricts.java
package com.opstty.job;

import com.opstty.mapper.TreesMapper;
import com.opstty.reducer.TreesReducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class DistinctDistricts {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: distinctDistricts <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "distinctDistricts");
        job.setJarByClass(DistinctDistricts.class);
        job.setMapperClass(TreesMapper.class);
        job.setCombinerClass(TreesReducer.class);
        job.setReducerClass(TreesReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        // All arguments except the last are input paths; the last is the output directory.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
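The driver treats every argument except the last as an input path and the last as the output directory. That convention can be isolated in a small plain-Java helper (the `JobArgs` name is illustrative, not from the project):

```java
import java.util.Arrays;

// Plain-Java sketch of the driver's argument convention:
// every argument but the last is an input path, the last is the output dir.
public class JobArgs {
    public final String[] inputs;
    public final String output;

    public JobArgs(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("Usage: <in> [<in>...] <out>");
        }
        inputs = Arrays.copyOfRange(args, 0, args.length - 1);
        output = args[args.length - 1];
    }
}
```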
AppDriver.java
package com.opstty;

import com.opstty.job.DistinctDistricts;
import com.opstty.job.WordCount;
import org.apache.hadoop.util.ProgramDriver;

public class AppDriver {
    public static void main(String[] argv) {
        int exitCode = -1;
        ProgramDriver programDriver = new ProgramDriver();
        try {
            programDriver.addClass("wordcount", WordCount.class,
                    "A map/reduce program that counts the words in the input files.");
            programDriver.addClass("distinctDistricts", DistinctDistricts.class,
                    "A map/reduce program that returns the distinct districts of trees in a CSV.");
            exitCode = programDriver.run(argv);
        } catch (Throwable throwable) {
            throwable.printStackTrace();
        }
        System.exit(exitCode);
    }
}
Rebuild the Maven JAR
julie@julie-HP-Pavilion-TS-14-Notebook-PC:~/Bureau/Big Data Frameworks/YARN_JAVA_MR$ mvn clean package
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/usr/share/maven/lib/guice.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[INFO] Scanning for projects...
[INFO]
[INFO] ----------------< com.opstty:hadoop-examples-mapreduce >----------------
[INFO] Building hadoop-examples-mapreduce 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-examples-mapreduce ---
[INFO] Deleting /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-examples-mapreduce ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.6.1:compile (default-compile) @ hadoop-examples-mapreduce ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 7 source files to /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/target/classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-examples-mapreduce ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.6.1:testCompile (default-testCompile) @ hadoop-examples-mapreduce ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/target/test-classes
[INFO] /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/src/test/java/com/opstty/mapper/TokenizerMapperTest.java: Some input files use unchecked or unsafe operations.
[INFO] /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/src/test/java/com/opstty/mapper/TokenizerMapperTest.java: Recompile with -Xlint:unchecked for details.
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ hadoop-examples-mapreduce ---
[INFO] Surefire report directory: /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/target/surefire-reports
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running com.opstty.reducer.IntSumReducerTest
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.mockito.cglib.core.ReflectUtils$2 (file:/home/julie/.m2/repository/org/mockito/mockito-all/1.10.19/mockito-all-1.10.19.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of org.mockito.cglib.core.ReflectUtils$2
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.501 sec
Running com.opstty.mapper.TokenizerMapperTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Results :
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hadoop-examples-mapreduce ---
[INFO] Building jar: /home/julie/Bureau/Big Data Frameworks/YARN_JAVA_MR/target/hadoop-examples-mapreduce-1.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-assembly-plugin:2.2-beta-5:single (default) @ hadoop-examples-mapreduce ---
[INFO] META-INF/ already added, skipping
[INFO] META-INF/MANIFEST.MF already added, skipping
21/11/06 19:43:06 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/06 19:43:06 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/06 19:43:06 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636224186565, maxDate=1636828986565, sequenceNumber=7472, masterKeyId=80 on ha-hdfs:efrei
21/11/06 19:43:06 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636224186565, maxDate=1636828986565, sequenceNumber=7472, masterKeyId=80)
21/11/06 19:43:06 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/06 19:43:06 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_4988
21/11/06 19:43:08 INFO input.FileInputFormat: Total input files to process : 1
21/11/06 19:43:08 INFO mapreduce.JobSubmitter: number of splits:1
21/11/06 19:43:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_4988
21/11/06 19:43:08 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636224186565, maxDate=1636828986565, sequenceNumber=7472, masterKeyId=80)]
21/11/06 19:43:08 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/06 19:43:08 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/06 19:43:08 INFO impl.YarnClientImpl: Submitted application application_1630864376208_4988
21/11/06 19:43:09 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_4988/
21/11/06 19:43:09 INFO mapreduce.Job: Running job: job_1630864376208_4988
21/11/06 19:43:19 INFO mapreduce.Job: Job job_1630864376208_4988 running in uber mode :false
21/11/06 19:43:19 INFO mapreduce.Job: map 0% reduce 0%
21/11/06 19:43:27 INFO mapreduce.Job: map 100% reduce 0%
21/11/06 19:43:32 INFO mapreduce.Job: map 100% reduce 100%
21/11/06 19:43:32 INFO mapreduce.Job: Job job_1630864376208_4988 completed successfully
21/11/06 19:43:32 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=176
FILE: Number of bytes written=526385
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=80
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=18921
Total time spent by all reduces in occupied slots (ms)=9988
Total time spent by all map tasks (ms)=6307
Total time spent by all reduce tasks (ms)=2497
Total vcore-milliseconds taken by all map tasks=6307
Total vcore-milliseconds taken by all reduce tasks=2497
Total megabyte-milliseconds taken by all map tasks=9687552
Total megabyte-milliseconds taken by all reduce tasks=5113856
Map-Reduce Framework
Map input records=98
Map output records=97
Map output bytes=776
Map output materialized bytes=176
Input split bytes=103
Combine input records=97
Combine output records=17
Reduce input groups=17
Reduce shuffle bytes=176
Reduce input records=17
Reduce output records=17
Spilled Records=34
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=192
CPU time spent (ms)=2300
Physical memory (bytes) snapshot=1447424000
Virtual memory (bytes) snapshot=7286034432
Total committed heap usage (bytes)=1510473728
Peak Map Physical memory (bytes)=1157574656
Peak Map Virtual memory (bytes)=3404156928
Peak Reduce Physical memory (bytes)=289849344
Peak Reduce Virtual memory (bytes)=3881877504
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=80
Check the result
[julie.ngan@hadoop-edge01 ~]$ alias result='function _result() { hdfs dfs -cat "$1"/part-r-00000; } ; _result'
[julie.ngan@hadoop-edge01 ~]$ result districts
21/11/09 19:48:28 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 19:48:29 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 19:48:29 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636483709216, maxDate=1637088509216, sequenceNumber=7686, masterKeyId=85 on ha-hdfs:efrei
21/11/09 19:48:29 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636483709216, maxDate=1637088509216, sequenceNumber=7686, masterKeyId=85)
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://efrei/user/julie.ngan/species already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:277)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1565)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1562)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1583)
at com.opstty.job.Species.main(Species.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at com.opstty.AppDriver.main(AppDriver.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
[julie.ngan@hadoop-edge01 ~]$ launch_job treeSpecies trees.csv species2
21/11/09 19:48:37 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 19:48:38 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 19:48:38 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636483718230, maxDate=1637088518230, sequenceNumber=7687, masterKeyId=85 on ha-hdfs:efrei
21/11/09 19:48:38 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636483718230, maxDate=1637088518230, sequenceNumber=7687, masterKeyId=85)
21/11/09 19:48:38 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/09 19:48:38 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_5109
21/11/09 19:48:39 INFO input.FileInputFormat: Total input files to process : 1
21/11/09 19:48:39 INFO mapreduce.JobSubmitter: number of splits:1
21/11/09 19:48:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_5109
21/11/09 19:48:39 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636483718230, maxDate=1637088518230, sequenceNumber=7687, masterKeyId=85)]
21/11/09 19:48:39 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/09 19:48:39 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/09 19:48:40 INFO impl.YarnClientImpl: Submitted application application_1630864376208_5109
21/11/09 19:48:40 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_5109/
21/11/09 19:48:40 INFO mapreduce.Job: Running job: job_1630864376208_5109
21/11/09 19:48:49 INFO mapreduce.Job: Job job_1630864376208_5109 running in uber mode :false
21/11/09 19:48:49 INFO mapreduce.Job: map 0% reduce 0%
21/11/09 19:48:58 INFO mapreduce.Job: map 100% reduce 0%
21/11/09 19:49:03 INFO mapreduce.Job: map 100% reduce 100%
21/11/09 19:49:03 INFO mapreduce.Job: Job job_1630864376208_5109 completed successfully
21/11/09 19:49:03 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=547
FILE: Number of bytes written=527735
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=451
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=20013
Total time spent by all reduces in occupied slots (ms)=10816
Total time spent by all map tasks (ms)=6671
Total time spent by all reduce tasks (ms)=2704
Total vcore-milliseconds taken by all map tasks=6671
Total vcore-milliseconds taken by all reduce tasks=2704
Total megabyte-milliseconds taken by all map tasks=10246656
Total megabyte-milliseconds taken by all reduce tasks=5537792
Map-Reduce Framework
Map input records=98
Map output records=97
Map output bytes=995
Map output materialized bytes=547
Input split bytes=103
Combine input records=97
Combine output records=45
Reduce input groups=45
Reduce shuffle bytes=547
Reduce input records=45
Reduce output records=45
Spilled Records=90
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=178
CPU time spent (ms)=2640
Physical memory (bytes) snapshot=1449582592
Virtual memory (bytes) snapshot=7281795072
Total committed heap usage (bytes)=1506803712
Peak Map Physical memory (bytes)=1156997120
Peak Map Virtual memory (bytes)=3403120640
Peak Reduce Physical memory (bytes)=292585472
Peak Reduce Virtual memory (bytes)=3878674432
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=451
21/11/09 20:55:28 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 20:55:28 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 20:55:28 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636487728747, maxDate=1637092528747, sequenceNumber=7704, masterKeyId=85 on ha-hdfs:efrei
21/11/09 20:55:28 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636487728747, maxDate=1637092528747, sequenceNumber=7704, masterKeyId=85)
21/11/09 20:55:28 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/09 20:55:28 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_5122
21/11/09 20:55:30 INFO input.FileInputFormat: Total input files to process : 1
21/11/09 20:55:30 INFO mapreduce.JobSubmitter: number of splits:1
21/11/09 20:55:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_5122
21/11/09 20:55:30 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636487728747, maxDate=1637092528747, sequenceNumber=7704, masterKeyId=85)]
21/11/09 20:55:30 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/09 20:55:30 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/09 20:55:30 INFO impl.YarnClientImpl: Submitted application application_1630864376208_5122
21/11/09 20:55:31 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_5122/
21/11/09 20:55:31 INFO mapreduce.Job: Running job: job_1630864376208_5122
21/11/09 20:55:41 INFO mapreduce.Job: Job job_1630864376208_5122 running in uber mode :false
21/11/09 20:55:41 INFO mapreduce.Job: map 0% reduce 0%
21/11/09 20:55:50 INFO mapreduce.Job: map 100% reduce 0%
21/11/09 20:55:55 INFO mapreduce.Job: map 100% reduce 100%
21/11/09 20:55:55 INFO mapreduce.Job: Job job_1630864376208_5122 completed successfully
21/11/09 20:55:55 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=539
FILE: Number of bytes written=527753
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=390
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=20478
Total time spent by all reduces in occupied slots (ms)=10448
Total time spent by all map tasks (ms)=6826
Total time spent by all reduce tasks (ms)=2612
Total vcore-milliseconds taken by all map tasks=6826
Total vcore-milliseconds taken by all reduce tasks=2612
Total megabyte-milliseconds taken by all map tasks=10484736
Total megabyte-milliseconds taken by all reduce tasks=5349376
Map-Reduce Framework
Map input records=98
Map output records=97
Map output bytes=1223
Map output materialized bytes=539
Input split bytes=103
Combine input records=97
Combine output records=36
Reduce input groups=36
Reduce shuffle bytes=539
Reduce input records=36
Reduce output records=36
Spilled Records=72
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=184
CPU time spent (ms)=2500
Physical memory (bytes) snapshot=1446887424
Virtual memory (bytes) snapshot=7286673408
Total committed heap usage (bytes)=1515192320
Peak Map Physical memory (bytes)=1155088384
Peak Map Virtual memory (bytes)=3404017664
Peak Reduce Physical memory (bytes)=291799040
Peak Reduce Virtual memory (bytes)=3882655744
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=390
1.8.5 Sort the tree heights from smallest to largest
[julie.ngan@hadoop-edge01 ~]$ launch_job sortedHeight trees.csv sortedHeight2
21/11/09 21:19:05 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 21:19:05 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 21:19:05 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489145474, maxDate=1637093945474, sequenceNumber=7709, masterKeyId=85 on ha-hdfs:efrei
21/11/09 21:19:05 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489145474, maxDate=1637093945474, sequenceNumber=7709, masterKeyId=85)
21/11/09 21:19:05 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/09 21:19:05 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_5126
21/11/09 21:19:06 INFO input.FileInputFormat: Total input files to process : 1
21/11/09 21:19:07 INFO mapreduce.JobSubmitter: number of splits:1
21/11/09 21:19:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_5126
21/11/09 21:19:07 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489145474, maxDate=1637093945474, sequenceNumber=7709, masterKeyId=85)]
21/11/09 21:19:07 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/09 21:19:07 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/09 21:19:07 INFO impl.YarnClientImpl: Submitted application application_1630864376208_5126
21/11/09 21:19:07 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_5126/
21/11/09 21:19:07 INFO mapreduce.Job: Running job: job_1630864376208_5126
21/11/09 21:19:16 INFO mapreduce.Job: Job job_1630864376208_5126 running in uber mode :false
21/11/09 21:19:16 INFO mapreduce.Job: map 0% reduce 0%
21/11/09 21:19:27 INFO mapreduce.Job: map 100% reduce 0%
21/11/09 21:19:36 INFO mapreduce.Job: map 100% reduce 100%
21/11/09 21:19:36 INFO mapreduce.Job: Job job_1630864376208_5126 completed successfully
21/11/09 21:19:36 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=4100
FILE: Number of bytes written=535273
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=3994
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=20775
Total time spent by all reduces in occupied slots (ms)=26736
Total time spent by all map tasks (ms)=6925
Total time spent by all reduce tasks (ms)=6684
Total vcore-milliseconds taken by all map tasks=6925
Total vcore-milliseconds taken by all reduce tasks=6684
Total megabyte-milliseconds taken by all map tasks=10636800
Total megabyte-milliseconds taken by all reduce tasks=13688832
Map-Reduce Framework
Map input records=98
Map output records=96
Map output bytes=3902
Map output materialized bytes=4100
Input split bytes=103
Combine input records=0
Combine output records=0
Reduce input groups=28
Reduce shuffle bytes=4100
Reduce input records=96
Reduce output records=96
Spilled Records=192
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=213
CPU time spent (ms)=2690
Physical memory (bytes) snapshot=1453654016
Virtual memory (bytes) snapshot=7284088832
Total committed heap usage (bytes)=1498415104
Peak Map Physical memory (bytes)=1159409664
Peak Map Virtual memory (bytes)=3403243520
Peak Reduce Physical memory (bytes)=294244352
Peak Reduce Virtual memory (bytes)=3880845312
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=3994
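The job's source is not shown in the log, but the counters above (98 map input records, 96 map output records, no combiner) suggest a mapper that drops the header and malformed rows and emits the height as the key, letting the shuffle do the actual sorting. A minimal Hadoop Streaming-style sketch of that logic, assuming a ';'-separated trees.csv with hypothetical column positions (height in column 6, tree name in column 3):

```python
# Streaming-style sketch of the sortedHeight logic (assumption: the real job
# is a compiled Java MapReduce class; column positions here are hypothetical).

def sorted_height_map(line):
    """Emit (height, name) pairs; the header and malformed rows are dropped."""
    fields = line.strip().split(";")
    try:
        height = int(fields[6])        # hypothetical: height in column 6
    except (IndexError, ValueError):
        return []                      # header line or row without a height
    return [(height, fields[3])]       # hypothetical: tree name in column 3

def shuffle_sort(pairs):
    """Stand-in for the MapReduce shuffle, which sorts records by key."""
    return sorted(pairs, key=lambda kv: kv[0])

def identity_reduce(pairs):
    """The reducer only writes the already-sorted records out."""
    return [f"{height}\t{name}" for height, name in pairs]
```

Note that with Hadoop's default Text keys the shuffle sorts lexicographically ("10" before "9"), so the real Java job would need an IntWritable key or a custom comparator to get a numeric smallest-to-largest order.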
1.8.6 District containing the oldest tree
[julie.ngan@hadoop-edge01 ~]$ launch_job districtOldestTree trees.csv districtOldestTree2
21/11/09 21:21:10 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 21:21:10 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 21:21:11 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489271005, maxDate=1637094071005, sequenceNumber=7711, masterKeyId=85 on ha-hdfs:efrei
21/11/09 21:21:11 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489271005, maxDate=1637094071005, sequenceNumber=7711, masterKeyId=85)
21/11/09 21:21:11 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/09 21:21:11 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_5128
21/11/09 21:21:12 INFO input.FileInputFormat: Total input files to process : 1
21/11/09 21:21:12 INFO mapreduce.JobSubmitter: number of splits:1
21/11/09 21:21:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_5128
21/11/09 21:21:12 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636489271005, maxDate=1637094071005, sequenceNumber=7711, masterKeyId=85)]
21/11/09 21:21:12 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/09 21:21:12 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/09 21:21:12 INFO impl.YarnClientImpl: Submitted application application_1630864376208_5128
21/11/09 21:21:12 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_5128/
21/11/09 21:21:12 INFO mapreduce.Job: Running job: job_1630864376208_5128
21/11/09 21:21:23 INFO mapreduce.Job: Job job_1630864376208_5128 running in uber mode :false
21/11/09 21:21:23 INFO mapreduce.Job: map 0% reduce 0%
21/11/09 21:21:31 INFO mapreduce.Job: map 100% reduce 0%
21/11/09 21:21:37 INFO mapreduce.Job: map 100% reduce 100%
21/11/09 21:21:37 INFO mapreduce.Job: Job job_1630864376208_5128 completed successfully
21/11/09 21:21:37 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=1315
FILE: Number of bytes written=529763
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=7
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=19392
Total time spent by all reduces in occupied slots (ms)=12116
Total time spent by all map tasks (ms)=6464
Total time spent by all reduce tasks (ms)=3029
Total vcore-milliseconds taken by all map tasks=6464
Total vcore-milliseconds taken by all reduce tasks=3029
Total megabyte-milliseconds taken by all map tasks=9928704
Total megabyte-milliseconds taken by all reduce tasks=6203392
Map-Reduce Framework
Map input records=98
Map output records=77
Map output bytes=1155
Map output materialized bytes=1315
Input split bytes=103
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=1315
Reduce input records=77
Reduce output records=1
Spilled Records=154
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=215
CPU time spent (ms)=2510
Physical memory (bytes) snapshot=1452765184
Virtual memory (bytes) snapshot=7281790976
Total committed heap usage (bytes)=1503657984
Peak Map Physical memory (bytes)=1158332416
Peak Map Virtual memory (bytes)=3401986048
Peak Reduce Physical memory (bytes)=294432768
Peak Reduce Virtual memory (bytes)=3879804928
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=7
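Here the counters are telling: 77 map output records from 98 inputs (rows without a usable planting year presumably dropped), a single reduce input group, and one reduce output record of 7 bytes. That pattern suggests the mapper routes every candidate to one constant key so a single reducer can take the global minimum planting year. A hedged sketch, with hypothetical column positions (district in column 1, planting year in column 5):

```python
# Streaming-style sketch of the districtOldestTree logic (assumption: the
# single constant key and the column positions are inferred from the
# counters above, not from shown source code).

SINGLE_KEY = "oldest"   # one key => one reduce group, matching the counters

def oldest_tree_map(line):
    """Emit (constant key, (year, district)); rows without a year are dropped."""
    fields = line.strip().split(";")
    try:
        year = int(fields[5])          # hypothetical: planting year in column 5
    except (IndexError, ValueError):
        return []                      # header, or tree with no recorded year
    return [(SINGLE_KEY, (year, fields[1]))]  # hypothetical: district in column 1

def oldest_tree_reduce(key, values):
    """Keep the pair with the smallest planting year; output its district."""
    year, district = min(values)
    return [district]
```

Funnelling everything through one key is simple but serializes the final step in a single reducer; that is acceptable here because the map side already shrinks the data to 77 small records.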
1.8.7 District containing the most trees: this requires chaining two MapReduce jobs (work in progress)
First MapReduce job: count the trees per district, structured like the distinctDistricts MapReduce job
[julie.ngan@hadoop-edge01 ~]$ launch_job districtMostTrees trees.csv districtMostTrees5
21/11/09 22:11:30 INFO impl.TimelineReaderClientImpl: Initialized TimelineReader URI=https://hadoop-master03.efrei.online:8199/ws/v2/timeline/, clusterId=yarn-cluster
21/11/09 22:11:30 INFO client.AHSProxy: Connecting to Application History server at hadoop-master03.efrei.online/163.172.102.23:10200
21/11/09 22:11:30 INFO hdfs.DFSClient: Created token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636492290877, maxDate=1637097090877, sequenceNumber=7722, masterKeyId=85 on ha-hdfs:efrei
21/11/09 22:11:30 INFO security.TokenCache: Got dt for hdfs://efrei; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636492290877, maxDate=1637097090877, sequenceNumber=7722, masterKeyId=85)
21/11/09 22:11:30 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
21/11/09 22:11:31 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/julie.ngan/.staging/job_1630864376208_5137
21/11/09 22:11:31 INFO input.FileInputFormat: Total input files to process : 1
21/11/09 22:11:31 INFO mapreduce.JobSubmitter: number of splits:1
21/11/09 22:11:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1630864376208_5137
21/11/09 22:11:32 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:efrei, Ident: (token for julie.ngan: HDFS_DELEGATION_TOKEN owner=julie.ngan@EFREI.ONLINE, renewer=yarn, realUser=, issueDate=1636492290877, maxDate=1637097090877, sequenceNumber=7722, masterKeyId=85)]
21/11/09 22:11:32 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/1.0.3.0-223/0/resource-types.xml
21/11/09 22:11:32 INFO impl.TimelineClientImpl: Timeline service address: hadoop-master03.efrei.online:8190
21/11/09 22:11:32 INFO impl.YarnClientImpl: Submitted application application_1630864376208_5137
21/11/09 22:11:32 INFO mapreduce.Job: The url to track the job: https://hadoop-master02.efrei.online:8090/proxy/application_1630864376208_5137/
21/11/09 22:11:32 INFO mapreduce.Job: Running job: job_1630864376208_5137
21/11/09 22:11:43 INFO mapreduce.Job: Job job_1630864376208_5137 running in uber mode :false
21/11/09 22:11:43 INFO mapreduce.Job: map 0% reduce 0%
21/11/09 22:11:51 INFO mapreduce.Job: map 100% reduce 0%
21/11/09 22:12:00 INFO mapreduce.Job: map 100% reduce 100%
21/11/09 22:12:00 INFO mapreduce.Job: Job job_1630864376208_5137 completed successfully
21/11/09 22:12:00 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=176
FILE: Number of bytes written=527097
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=16783
HDFS: Number of bytes written=80
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=19878
Total time spent by all reduces in occupied slots (ms)=25700
Total time spent by all map tasks (ms)=6626
Total time spent by all reduce tasks (ms)=6425
Total vcore-milliseconds taken by all map tasks=6626
Total vcore-milliseconds taken by all reduce tasks=6425
Total megabyte-milliseconds taken by all map tasks=10177536
Total megabyte-milliseconds taken by all reduce tasks=13158400
Map-Reduce Framework
Map input records=98
Map output records=97
Map output bytes=776
Map output materialized bytes=176
Input split bytes=103
Combine input records=97
Combine output records=17
Reduce input groups=17
Reduce shuffle bytes=176
Reduce input records=17
Reduce output records=17
Spilled Records=34
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=192
CPU time spent (ms)=2220
Physical memory (bytes) snapshot=1453723648
Virtual memory (bytes) snapshot=7284150272
Total committed heap usage (bytes)=1501036544
Peak Map Physical memory (bytes)=1160740864
Peak Map Virtual memory (bytes)=3403567104
Peak Reduce Physical memory (bytes)=292982784
Peak Reduce Virtual memory (bytes)=3880583168
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=16680
File Output Format Counters
Bytes Written=80
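The counters of this first job (Combine input records=97, Combine output records=17, Reduce output records=17) are consistent with a per-district count that reuses the reducer as a combiner, producing 17 (district, count) pairs. The planned second job would then reduce those 17 pairs down to the single maximum. A sketch of both steps, again with hypothetical column positions:

```python
from collections import defaultdict

# Sketch of the planned two-job chain (hypothetical column positions; the
# second job had not been run at the time of this log).

def group_by_key(pairs):
    """Stand-in for the shuffle: group values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped.items()

# --- Job 1: count trees per district (count_reduce also works as the combiner)
def count_map(line):
    fields = line.strip().split(";")
    if len(fields) < 2 or not fields[0].isdigit():   # hypothetical header guard
        return []
    return [(fields[1], 1)]             # hypothetical: district in column 1

def count_reduce(district, counts):
    return [(district, sum(counts))]

# --- Job 2: route every (district, count) to one key and keep the maximum
def max_map(district, count):
    return [("max", (count, district))]

def max_reduce(key, values):
    count, district = max(values)
    return [f"{district}\t{count}"]
```

Since a sum is associative, running count_reduce as a combiner is safe and explains why only 17 records reached the reducer; the second job then uses the same single-key trick as districtOldestTree to pick the maximum.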