I had always heard people talk about Hadoop as something rather mysterious; only after reading an introduction did I realize it is all about distributed computing. Distributed systems and big data are hot topics right now, so I could not resist joining in and trying it out.
First, my environment:
www.linuxidc.com@linuxidc-virtual:~/Downloads$ uname -a
Linux linuxidc-virtual 3.11.0-17-generic #31~precise1-Ubuntu SMP Tue Feb 4 21:29:23 UTC 2014 i686 i686 i386 GNU/Linux
Preparation: Hadoop is an Apache project and, as you would expect, it is built on Java, so you need a JDK installed first, whether OpenJDK or the Oracle JDK.
Reference: http://www.linuxidc.com/Linux/2014-03/97462.htm
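As a quick sanity check before going further, you can confirm that a JDK is actually on the machine and note where it lives, since that path becomes JAVA_HOME later. A minimal sketch (the /usr/lib/jvm location is an assumption based on where Ubuntu usually puts JDKs):

java -version                    # should print the JDK version, e.g. 1.7.0_45
update-alternatives --list java  # Debian/Ubuntu: shows which JDK binaries are registered
ls /usr/lib/jvm/                 # the directory listed here is what JAVA_HOME should point to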
Good, now let's get to the main part.
1. Download Hadoop from the official Apache site. I went with the latest release at the time, hadoop-2.3.0; if you pick another version, double-check the steps, since they can differ. For a test environment the newest release is a fine choice.
Extract the archive and copy it to the intended location.
www.linuxidc.com@linuxidc-virtual:~/Downloads$ tar -xvf hadoop-2.3.0.tar.gz
www.linuxidc.com@linuxidc-virtual:~/Downloads$ sudo cp -r hadoop-2.3.0 /usr/local/hadoop/
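If you would rather fetch the tarball from the command line than through a browser, something along these lines works. A hedged sketch, assuming the Apache archive mirror still carries the 2.3.0 release at this path:

cd ~/Downloads
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.3.0/hadoop-2.3.0.tar.gz
tar -xzf hadoop-2.3.0.tar.gz     # same as the tar -xvf above; -z just makes the gzip step explicit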
2. Why bring up the version at all? Because the location of the configuration files has changed quite a bit between older releases and this one.
In this version the configuration files live here:
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ ls -al
total 128
drwxr-xr-x 2 linuxidc linuxidc 4096 Feb 27 15:09 .
drwxr-xr-x 3 linuxidc linuxidc 4096 Feb 27 15:09 ..
-rw-r--r-- 1 linuxidc linuxidc 3589 Feb 27 15:09 capacity-scheduler.xml
-rw-r--r-- 1 linuxidc linuxidc 1335 Feb 27 15:09 configuration.xsl
-rw-r--r-- 1 linuxidc linuxidc 318 Feb 27 15:09 container-executor.cfg
-rw-r--r-- 1 linuxidc linuxidc 860 Feb 27 15:23 core-site.xml
-rw-r--r-- 1 linuxidc linuxidc 3589 Feb 27 15:09 hadoop-env.cmd
-rw-r--r-- 1 linuxidc linuxidc 3402 Feb 27 15:42 hadoop-env.sh
-rw-r--r-- 1 linuxidc linuxidc 1774 Feb 27 15:09 hadoop-metrics2.properties
-rw-r--r-- 1 linuxidc linuxidc 2490 Feb 27 15:09 hadoop-metrics.properties
-rw-r--r-- 1 linuxidc linuxidc 9257 Feb 27 15:09 hadoop-policy.xml
-rw-r--r-- 1 linuxidc linuxidc 984 Feb 27 15:27 hdfs-site.xml
-rw-r--r-- 1 linuxidc linuxidc 1449 Feb 27 15:09 httpfs-env.sh
-rw-r--r-- 1 linuxidc linuxidc 1657 Feb 27 15:09 httpfs-log4j.properties
-rw-r--r-- 1 linuxidc linuxidc 21 Feb 27 15:09 httpfs-signature.secret
-rw-r--r-- 1 linuxidc linuxidc 620 Feb 27 15:09 httpfs-site.xml
-rw-r--r-- 1 linuxidc linuxidc 11169 Feb 27 15:09 log4j.properties
-rw-r--r-- 1 linuxidc linuxidc 918 Feb 27 15:09 mapred-env.cmd
-rw-r--r-- 1 linuxidc linuxidc 1383 Feb 27 15:09 mapred-env.sh
-rw-r--r-- 1 linuxidc linuxidc 4113 Feb 27 15:09 mapred-queues.xml.template
-rw-r--r-- 1 linuxidc linuxidc 758 Feb 27 15:09 mapred-site.xml.template
-rw-r--r-- 1 linuxidc linuxidc 10 Feb 27 15:09 slaves
-rw-r--r-- 1 linuxidc linuxidc 2316 Feb 27 15:09 ssl-client.xml.example
-rw-r--r-- 1 linuxidc linuxidc 2268 Feb 27 15:09 ssl-server.xml.example
-rw-r--r-- 1 linuxidc linuxidc 2178 Feb 27 15:09 yarn-env.cmd
-rw-r--r-- 1 linuxidc linuxidc 4084 Feb 27 15:09 yarn-env.sh
-rw-r--r-- 1 linuxidc linuxidc 772 Feb 27 15:30 yarn-site.xml
The configuration files we need to modify are the following.
hadoop-env.sh: find the JAVA_HOME line and change it like this.
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
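While editing environment settings, it can also save typing later to put the Hadoop bin and sbin directories on your PATH. This is optional and not part of the original steps; a sketch assuming a Bash shell and the /usr/local/hadoop install path used above:

# append to ~/.bashrc, then reload it with: source ~/.bashrc
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin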
core-site.xml
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
</configuration>
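One optional property worth knowing about here: by default Hadoop keeps its working files under /tmp, which is cleared on reboot, and setting hadoop.tmp.dir points them at a persistent directory instead. This is an addition of mine, not part of the file shown above; the /usr/local/hadoop/tmp path is only an example, and the directory must exist and be writable by the Hadoop user:

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>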
hdfs-site.xml
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/dfs/data</value>
</property>
</configuration>
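Two small points that often cause trouble with this file. First, it does no harm to create the name and data directories up front so their permissions are under your control (step 3 below fixes ownership for the whole tree anyway):

sudo mkdir -p /usr/local/hadoop/dfs/name /usr/local/hadoop/dfs/data

Second, on a single-node setup it is common to drop the block replication factor to 1. This extra property is not in the original file above, but if you want it, it goes inside the same <configuration> block:

<property>
<name>dfs.replication</name>
<value>1</value>
</property>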
yarn-site.xml (this one is new in Hadoop 2.x; I am using it in place of the old mapred-site.xml, which now ships only as the mapred-site.xml.template listed earlier)
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
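For reference, the layout described in the Hadoop 2.x documentation for a pseudo-distributed YARN setup is slightly different: mapreduce.framework.name goes into mapred-site.xml (created by copying the template listed earlier), and yarn-site.xml carries the shuffle auxiliary service. A sketch of that conventional variant, in case the single-file layout above gives you trouble:

www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ cp mapred-site.xml.template mapred-site.xml

mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>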
3. Change the ownership of the Hadoop directory. This step is required: we copied everything with root privileges, but the daemons should run as the regular user, and otherwise the startup scripts keep prompting for that user's password.
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/etc/hadoop$ sudo chown -R linuxidc:linuxidc /usr/local/hadoop/
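Related to that password prompt: the start-all.sh script used below reaches every daemon over SSH, even on a single machine, so it will keep asking for the user's password unless localhost accepts your key. A hedged sketch of setting up passwordless SSH for the current user, assuming openssh-server is already installed:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa       # key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                  # should now log in without a password prompt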
4. The basic configuration is done, so now we can start it up and have a look.
Go into the bin directory under the Hadoop home and format the NameNode.
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/bin$ ./hadoop namenode -format
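A side note: in Hadoop 2.x the hadoop namenode command still works but prints a deprecation warning, since HDFS-specific commands have moved to the hdfs script. The equivalent would be:

www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/bin$ ./hdfs namenode -format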
Under sbin, start the NameNode and DataNode.
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/sbin$ ./hadoop-daemon.sh start namenode
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/sbin$ ./hadoop-daemon.sh start datanode
Start the rest of the Hadoop daemons.
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/sbin$ ./start-all.sh
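To confirm that everything actually came up, jps (shipped with the JDK) lists the running Java processes. On a working pseudo-distributed node you would expect to see roughly the following (the PIDs here are made up):

www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/sbin$ jps
3538 NameNode
3620 DataNode
3801 SecondaryNameNode
3952 ResourceManager
4060 NodeManager
4173 Jps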
Here is what it looks like (screenshot of the NameNode web UI omitted):
Visit http://172.16.80.228:50070/.
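The YARN ResourceManager has its own web UI as well, by default on port 8088 of the same host, which is worth a look once the daemons are running:

http://172.16.80.228:8088/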
And since we started it, we should also know how to shut it down.
www.linuxidc.com@linuxidc-virtual:/usr/local/hadoop/sbin$ ./stop-all.sh