How to Connect to HDFS Through Its Java and Python APIs

This article looks at how to connect to HDFS through its Java and Python APIs. Many people are unfamiliar with the topic, so the key steps are summarized below.


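We now turn to operating HDFS through the Java and Python APIs; a Scala version may follow later.

Before covering the Java API, a note on the IDE used here: IntelliJ IDEA. I use the 2020.3 x64 Community Edition.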

Java API

Create a Maven project. For the Maven configuration in IDEA, set the Maven download source to the Aliyun mirror.


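The Aliyun download source is configured in the corresponding settings.xml, here D:\apache-maven-3.8.1-bin\apache-maven-3.8.1\conf\settings.xml.

Next, create the Maven project and add the usual dependencies.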


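Add the Hadoop dependencies (hadoop-common, hadoop-hdfs, and hadoop-client), ideally at the same version as the Hadoop installation you are connecting to, plus the junit dependency for unit tests.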

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.1.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.1.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.1.4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
    </dependency>
</dependencies>

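HDFS File Upload

A test class is all that is needed here. Create a new Java file, main.java.

By default, FileSystem resolves to the local file system, so it has to be initialized against HDFS explicitly.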

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

import java.net.URI;

public class main {

    @Test
    public void testPut() throws Exception {
        // There are several ways to obtain a FileSystem; only one is shown
        // here (passing a URI is the more common choice).
        Configuration configuration = new Configuration();
        // "hadoop" is the user account on the Hadoop cluster. The port is
        // 9000 here; it must match fs.defaultFS in the cluster's core-site.xml.
        FileSystem fileSystem = FileSystem.get(
                new URI("hdfs://192.168.147.128:9000"),
                configuration,
                "hadoop");
        // Upload f:/stopword.txt to /user/stopword.txt
        fileSystem.copyFromLocalFile(
                new Path("f:/stopword.txt"), new Path("/user/stopword.txt"));
        fileSystem.close();
    }
}

The stop-word list I just uploaded (a file used for machine learning) now appears in HDFS.

HDFS File Download

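Because every test has to initialize FileSystem, the setup can be moved into a JUnit @Before method that runs before each test, which is what I do out of laziness. The listing below keeps the initialization inline so that it stands on its own.

The HDFS download API is copyToLocalFile; the code is as follows.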

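// Goes inside the test class shown above and uses the same imports.
@Test
public void testDownload() throws Exception {
    Configuration configuration = new Configuration();
    FileSystem fileSystem = FileSystem.get(
            new URI("hdfs://192.168.147.128:9000"),
            configuration,
            "hadoop");
    fileSystem.copyToLocalFile(
            false,                           // delSrc: keep the source file on HDFS
            new Path("/user/stopword.txt"),  // source path on HDFS
            new Path("stop.txt"),            // destination path on the local disk
            true);                           // useRawLocalFileSystem: no local .crc file
    fileSystem.close();
    System.out.println("over");
}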

Python API

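This section covers the hdfs package.

Install the package with pip install hdfs. Before using it, run hadoop fs -chmod -R 777 / to grant read, write, and execute permission on the HDFS root directory and everything under it. This opens the whole file system to every user, so it is only appropriate on a test cluster.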

>>> from hdfs.client import Client
>>> # Hadoop 2.x uses port 50070; Hadoop 3.x uses port 9870
>>> client = Client('http://192.168.147.128:9870')
>>> client.list('/')   # list the entries under the HDFS root /
['hadoop-3.1.4.tar.gz']
>>> client.makedirs('/test')
>>> client.list('/')
['hadoop-3.1.4.tar.gz', 'test']
>>> client.delete("/test")
True
>>> client.download('/hadoop-3.1.4.tar.gz','C:\\Users\\YIUYE\\Desktop')
'C:\\Users\\YIUYE\\Desktop\\hadoop-3.1.4.tar.gz'
>>> client.upload('/','C:\\Users\\YIUYE\\Desktop\\demo.txt')
'/demo.txt'
>>> client.list('/')
['demo.txt', 'hadoop-3.1.4.tar.gz']
>>> # demo.txt was uploaded with the content: Hello \n hdfs
>>> with client.read("/demo.txt") as reader:
...     print(reader.read())
...
b'Hello \r\nhdfs\r\n'
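The read call in the session returns bytes, including the Windows line endings of the uploaded file. Below is a minimal sketch of turning that payload into text lines using only the standard library. The byte literal is copied from the session output, the variable names are illustrative, and UTF-8 is an assumption about how demo.txt was saved.

```python
# Bytes exactly as printed by reader.read() in the session.
raw = b'Hello \r\nhdfs\r\n'

# Decode to str. UTF-8 is an assumption about the file's encoding.
text = raw.decode('utf-8')

# splitlines() treats \r\n as one line break and drops it from each line.
lines = text.splitlines()
print(lines)  # ['Hello ', 'hdfs']
```

The hdfs package's read method also accepts an encoding argument, for example client.read('/demo.txt', encoding='utf-8'), in which case the reader should yield str instead of bytes and the explicit decode step is not needed.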

Compared with the Java API, connecting through the Python API is really simple.
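Part of the difference comes from the transport. The Java client talks to the NameNode over its RPC port (9000 in this setup), while the hdfs package is a client for the WebHDFS REST API and therefore uses the NameNode's HTTP port (9870 on Hadoop 3.x, 50070 on 2.x). The sketch below only builds the kind of URL that WebHDFS expects, to show what the Python calls map to. webhdfs_url is a hypothetical helper written for illustration and is not part of the hdfs package, which assembles its requests internally.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op, params=None):
    """Build a WebHDFS REST URL for one operation (hypothetical helper)."""
    query = urlencode({"op": op, **(params or {})})
    # WebHDFS exposes the file system under the /webhdfs/v1 prefix.
    return f"{base.rstrip('/')}/webhdfs/v1{quote(path)}?{query}"


# NameNode HTTP address taken from the session (Hadoop 3.x port).
NAMENODE = "http://192.168.147.128:9870"

# Roughly what list('/'), makedirs('/test') and read('/demo.txt') request.
print(webhdfs_url(NAMENODE, "/", "LISTSTATUS"))
print(webhdfs_url(NAMENODE, "/test", "MKDIRS"))
print(webhdfs_url(NAMENODE, "/demo.txt", "OPEN", {"user.name": "hadoop"}))
```

The user.name query parameter is how WebHDFS identifies the caller when security is off; passing the cluster account this way is an alternative to loosening permissions for everyone.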

