
Medicare Fraud Detection Using Open Source Data

March 17, 2019


hadoop install

Steps and necessary files for installing Hadoop + YARN 2.6 on Ubuntu 14.10 (from http://releases.ubuntu.com/14.10/ubuntu-14.10-desktop-amd64.iso)

I collected as many instructions as I could (see the refs below), then cherry-picked the steps I liked and put them here. These steps were tested on my Hadoop cluster, and they work. There are three big steps: install the packages, configure them, and edit the Hadoop XML files. I used tmux with synchronize-panes to set up all the machines at once.

##machines

  • pocoyo-1 192.168.1.72 (master)
  • pocoyo-2 192.168.1.52 (data node)
  • pocoyo-3 192.168.1.44 (data node)

edit host

  • vi /etc/hostname
  • check the machine name on each machine and change it if you want, for example:
  • pocoyo-1
  • sudo vi /etc/hosts
  • add the following lines on each machine, or edit one machine and use scp to copy the file to the others
  • 127.0.0.1 localhost
  • 192.168.1.72 pocoyo-1 # nameNode
  • 192.168.1.52 pocoyo-2 # secondary nameNode
  • 192.168.1.44 pocoyo-3 # data node
  • sudo scp 192.168.1.72:/etc/hosts /etc/hosts (run this on slaves)
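The manual edit above can also be scripted. This is only a minimal sketch, assuming the three addresses from the machines list; `add_host` is a hypothetical helper, and `HOSTS_FILE` defaults to a scratch file so nothing under /etc is touched until you point it there yourself.

```sh
#!/bin/sh
# Sketch: add the cluster entries to a hosts file without duplicating them.
# HOSTS_FILE is an assumption: it defaults to a scratch file so the script
# is safe to try; set HOSTS_FILE=/etc/hosts (and run as root) to apply it.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# add_host IP NAME: append "IP NAME" unless NAME already has an entry.
add_host() {
    if ! grep -qE "[[:space:]]$2([[:space:]]|\$)" "$HOSTS_FILE"; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.1.72 pocoyo-1
add_host 192.168.1.52 pocoyo-2
add_host 192.168.1.44 pocoyo-3
```

Because each name is checked before it is appended, the script can be run again on the same file without adding duplicate lines.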

##create hadoop user and user group on each machine

  • sudo addgroup hadoop
  • sudo adduser --ingroup hadoop hduser
  • sudo adduser hduser sudo
  • sudo chown -R hduser:hadoop /usr/local/

##install ssh on each machine (the following is not a secure setup, but it is faster for test purposes)

  • su - hduser
  • sudo apt-get install openssh-server
  • ssh localhost

on master

  • ssh-keygen -t rsa -P ""
  • cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

on slaves (pocoyo-2 and pocoyo-3)

  • mkdir -p ~/.ssh

on master

  • ssh-copy-id hduser@pocoyo-2 (do the same for pocoyo-3)
  • ssh hduser@pocoyo-2
  • ssh hduser@pocoyo-3
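To confirm the key setup above without logging in to each node by hand, a loop like the following could be used. It is a sketch under assumptions: `check_nodes` is a hypothetical helper, and the `SSH` variable exists only so the command can be swapped out when trying the function without a cluster. `BatchMode` and `ConnectTimeout` are standard OpenSSH options.

```sh
#!/bin/sh
# Sketch: confirm that the master can reach every node without a password.
# SSH is overridable (an assumption for testing); by default it is plain ssh.
SSH="${SSH:-ssh}"

# check_nodes HOST...: returns non-zero if any node refuses key-based login.
check_nodes() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if $SSH -o BatchMode=yes -o ConnectTimeout=5 "hduser@$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

On the master, `check_nodes pocoyo-1 pocoyo-2 pocoyo-3` should report ok for every host once the keys are in place.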

##disable ipv6 for each machine (:setw synchronize-panes in tmux worked for me)

  • sudo vi /etc/sysctl.conf
  • add the following lines
  • net.ipv6.conf.all.disable_ipv6 = 1
  • net.ipv6.conf.default.disable_ipv6 = 1
  • net.ipv6.conf.lo.disable_ipv6 = 1

run

  • sudo service networking restart
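A quick way to see whether the setting took effect is to read the kernel's live values. The sketch below assumes the usual /proc layout on Linux; `ipv6_disabled` is a hypothetical helper, and its optional argument is there only so the function can be tried against a scratch directory.

```sh
#!/bin/sh
# Sketch: report whether IPv6 is disabled, by reading the kernel's live values.
# The optional argument (an assumption for testing) overrides the /proc path.
ipv6_disabled() {
    conf_dir="${1:-/proc/sys/net/ipv6/conf}"
    for iface in all default lo; do
        value=$(cat "$conf_dir/$iface/disable_ipv6" 2>/dev/null)
        [ "$value" = "1" ] || return 1
    done
    return 0
}
```

For example, `ipv6_disabled && echo "IPv6 is off"`. If the values are still 0, `sudo sysctl -p` re-reads /etc/sysctl.conf and applies it.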

##download hadoop for each machine

(once it is downloaded on one machine, you can use scp to copy it to the others)

* su - hduser

* cd /usr/local

* wget http://mirror.reverse.net/pub/apache/hadoop/common/stable2/hadoop-2.6.0.tar.gz

* tar -xzf hadoop-2.6.0.tar.gz

* ln -s /usr/local/hadoop-2.6.0 /usr/local/hadoop
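The extract-and-link steps above can be wrapped so they are repeatable on every machine. This is a sketch, not the method used here: `install_release` is a hypothetical helper that reads the top-level directory name from the archive instead of assuming it, and it expects the archive's first entry to be that directory.

```sh
#!/bin/sh
# Sketch: unpack a release tarball and point a version-free symlink at it.
# install_release TARBALL PREFIX LINK_NAME   (hypothetical helper)
install_release() {
    tarball=$1
    prefix=$2
    link_name=$3
    # The top-level directory inside the archive, e.g. hadoop-2.6.0
    top_dir=$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)
    [ -n "$top_dir" ] || return 1
    tar -xzf "$tarball" -C "$prefix" || return 1
    # -n replaces an existing symlink rather than descending into it
    ln -sfn "$prefix/$top_dir" "$prefix/$link_name"
}
```

With the download above this would be called as `install_release hadoop-2.6.0.tar.gz /usr/local hadoop`.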

##install java 1.7 for all machines.

(once it is downloaded on one machine, you can use scp to copy it to the others)

We selected 1.7 because it is the version listed on http://wiki.apache.org/hadoop/HadoopJavaVersions

* su - hduser

* cd /usr/local

* wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jdk-7u75-linux-x64.tar.gz"

* tar -xzf jdk-7u75-linux-x64.tar.gz

* ln -s /usr/local/jdk1.7.0_75 /usr/local/jdk (the tarball extracts to jdk1.7.0_75, not jdk-7u75-linux-x64)

## edit /etc/profile for master

(:setw synchronize-panes in tmux worked for me)

* sudo vi /etc/profile

  * add the following lines

```sh
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export JAVA_HOME=/usr/local/jdk
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
```

  • source /etc/profile
  • java -version (to test)

on slaves

  • sudo scp hduser@pocoyo-1:/etc/profile /etc/profile
  • source /etc/profile
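After sourcing the profile, it can help to check that both variables point at real directories, since a broken symlink only shows up later as a confusing Hadoop error. A minimal sketch, with `check_env` as a hypothetical helper that only inspects the current shell:

```sh
#!/bin/sh
# Sketch: sanity-check the variables that /etc/profile is expected to export.
# check_env is a hypothetical helper; it only inspects the current shell.
check_env() {
    status=0
    for var in HADOOP_HOME JAVA_HOME; do
        # Read the value of the variable whose name is stored in $var.
        eval "dir=\${$var:-}"
        if [ -z "$dir" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$dir" ]; then
            echo "$var points to a missing directory: $dir"
            status=1
        else
            echo "$var=$dir"
        fi
    done
    return "$status"
}
```

Running `check_env` on each machine returns non-zero if either variable is unset or points nowhere.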

##config hadoop xml files.

modify $HADOOP_HOME/etc/hadoop/hadoop-env.sh on all machines; add

  • export JAVA_HOME=/usr/local/jdk

modify $HADOOP_HOME/etc/hadoop/slaves on all machines; add

  • pocoyo-1
  • pocoyo-2
  • pocoyo-3

copy xml files first

  • cd $HADOOP_HOME
  • cp ./share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml ./etc/hadoop/core-site.xml
  • cp ./share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml ./etc/hadoop/hdfs-site.xml
  • cp ./share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml ./etc/hadoop/yarn-site.xml
  • cp ./share/doc/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml ./etc/hadoop/mapred-site.xml

modify core-site.xml

| property | value | machines |
| --- | --- | --- |
| fs.defaultFS | hdfs://pocoyo-1:9000 | all |
| hadoop.tmp.dir | /usr/local/hadoop/tmp | all |
| io.file.buffer.size | 131072 | all |
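For reference, this is roughly what those three core-site.xml properties look like in the file itself. The sketch writes a minimal file from scratch rather than editing the copied default file as done above; Hadoop falls back to its built-in defaults for anything not listed. `CONF_DIR` is an assumption: it defaults to a scratch directory, and on a real node it would be $HADOOP_HOME/etc/hadoop.

```sh
#!/bin/sh
# Sketch: write a minimal core-site.xml holding only the three properties
# listed for it. CONF_DIR is an assumption: it defaults to a scratch
# directory; on a real node it would be $HADOOP_HOME/etc/hadoop.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://pocoyo-1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/core-site.xml"
```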

modify hdfs-site.xml

| property | value | machines |
| --- | --- | --- |
| dfs.namenode.rpc-address | pocoyo-1:9001 | all |
| dfs.namenode.secondary.http-address | pocoyo-2:50090 | namenode and secondary nameNode |
| dfs.namenode.name.dir | /usr/local/hadoop/dfs/name | namenode and secondary nameNode |
| dfs.datanode.data.dir | /usr/local/hadoop/data | datanodes |
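Since the hdfs-site.xml properties differ per machine role, a small generator keeps the per-role lists short. This is a sketch under assumptions: `site_xml` is a hypothetical helper, not part of Hadoop, and it does no XML escaping, which is fine for the host names and paths used here.

```sh
#!/bin/sh
# Sketch: print a Hadoop site file from name=value arguments.
# site_xml is a hypothetical helper, not part of Hadoop.
site_xml() {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first "="
        value=${pair#*=}     # text after the first "="
        printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
            "$name" "$value"
    done
    echo '</configuration>'
}

# Properties for the name node (pocoyo-1), taken from the hdfs-site.xml values.
site_xml \
    dfs.namenode.rpc-address=pocoyo-1:9001 \
    dfs.namenode.secondary.http-address=pocoyo-2:50090 \
    dfs.namenode.name.dir=/usr/local/hadoop/dfs/name
```

Redirect the output to $HADOOP_HOME/etc/hadoop/hdfs-site.xml to use it. A data node would pass `dfs.datanode.data.dir=/usr/local/hadoop/data` in place of the two name-node-only entries.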

modify mapred-site.xml

...
