1) Got caught up with Hadoop/Yarn again. Before I graduated from ICT, I had spent a lot of time researching distributed computing frameworks; then, because my first job focused on HBase, they were set aside. Now that I am back on the subject, I feel excited and a little rusty: Hadoop security, Yarn log policies, container lifetimes, and so on.
2) Acquired a lot of practice in maintaining and managing a large number of ordinary users, especially in a multi-tenant, data-sharing environment.
3) Touched and became familiar with several other components in the Hadoop ecosystem.
Oozie – a very useful workflow scheduler and coordinator; it lets users trigger actions when specified conditions are met.
Hive – a SQL-based data warehouse on top of Hadoop, which can create a table over a read-only HDFS directory.
Pig – a procedural, script-based big-data processing language, which gives end users more flexibility and extensibility.
These days, I have found my colleagues in the US and India to be very diligent and hard-working. Working with them has convinced me that if you want to do better, you have to keep struggling, again and again.
I will try to improve my English and deepen my knowledge of distributed computing; hopefully I can contribute much better patches to Apache.