如何使用TunnelSDK上傳和下載MaxCompute復(fù)雜類型數(shù)據(jù)

如何使用Tunnel SDK上傳和下載MaxCompute復(fù)雜類型數(shù)據(jù)，相信很多沒有經(jīng)驗(yàn)的人對(duì)此束手無(wú)策，為此本文總結(jié)了問題出現(xiàn)的原因和解決方法，通過這篇文章希望你能解決這個(gè)問題。

創(chuàng)新互聯(lián)建站主要從事網(wǎng)站建設(shè)、做網(wǎng)站、網(wǎng)頁(yè)設(shè)計(jì)、企業(yè)做網(wǎng)站、公司建網(wǎng)站等業(yè)務(wù)。立足成都服務(wù)蘇尼特右,十多年網(wǎng)站建設(shè)經(jīng)驗(yàn),價(jià)格優(yōu)惠、服務(wù)專業(yè),歡迎來(lái)電咨詢建站服務(wù):13518219792

基于Tunnel SDK如何上傳復(fù)雜類型數(shù)據(jù)到MaxCompute？首先介紹一下MaxCompute復(fù)雜數(shù)據(jù)類型：

復(fù)雜數(shù)據(jù)類型

MaxCompute采用基于ODPS2.0的SQL引擎，豐富了對(duì)復(fù)雜數(shù)據(jù)類型類型的支持。MaxCompute支持ARRAY, MAP, STRUCT類型，并且可以任意嵌套使用并提供了配套的內(nèi)建函數(shù)。

類型	定義方法	構(gòu)造方法
ARRAY	array;array>	array(1, 2, 3); array(array(1, 2); array(3, 4))
MAP	map;map>	map(“k1”, “v1”, “k2”, “v2”);map(1S, array(‘a(chǎn)’, ‘b’), 2S, array(‘x’, ‘y))
STRUCT	struct;struct< field1:bigint, field2:array, field3:map>	named_struct(‘x’, 1, ‘y’, 2);named_struct(‘field1’, 100L, ‘field2’, array(1, 2), ‘field3’, map(1, 100, 2, 200)

復(fù)雜類型構(gòu)造與操作函數(shù)

返回類型	簽名	注釋
MAP	map(K key1, V value1, K key2, V value2, ...)	使用給定key/value對(duì)建立map, 所有key類型一致，必須是基本類型，所有value類型一致，可為任意類型
ARRAY	map_keys(Map m)	將參數(shù)中的map的所有key作為數(shù)組返回，輸入NULL，返回NULL
ARRAY	map_values(MAP m)	將參數(shù)中的map的所有value作為數(shù)組返回，輸入NULL，返回NULL
int	size(MAP)	取得給定MAP元素?cái)?shù)目
TABLE	explode(MAP)	表生成函數(shù)，將給定MAP展開，每個(gè)key/value一行，每行兩列分別對(duì)應(yīng)key和value
ARRAY	array(T value1, T value2, ...)	使用給定value構(gòu)造ARRAY，所有value類型一致
int	size(ARRAY)	取得給定ARRAY元素?cái)?shù)目
boolean	array_contains(ARRAY a, value v)	檢測(cè)給定ARRAY a中是否包含v
ARRAY	sort_array(ARRAY)	對(duì)給定數(shù)組排序
ARRAY	collect_list(T col)	聚合函數(shù)，在給定group內(nèi)，將col指定的表達(dá)式聚合為一個(gè)數(shù)組
ARRAY	collect_set(T col)	聚合函數(shù)，在給定group內(nèi)，將col指定的表達(dá)式聚合為一個(gè)無(wú)重復(fù)元素的集合數(shù)組
TABLE	explode(ARRAY)	表生成函數(shù)，將給定ARRAY展開，每個(gè)value一行，每行一列對(duì)應(yīng)相應(yīng)數(shù)組元素
TABLE (int, T)	posexplode(ARRAY)	表生成函數(shù)，將給定ARRAY展開，每個(gè)value一行，每行兩列分別對(duì)應(yīng)數(shù)組從0開始的下標(biāo)和數(shù)組元素
STRUCT	struct(T1 value1, T2 value2, ...)	使用給定value列表建立struct, 各value可為任意類型，生成struct的field的名稱依次為col1, col2, ...
STRUCT	named_struct(name1, value1, name2, value2, ...)	使用給定name/value列表建立struct, 各value可為任意類型，生成struct的field的名稱依次為name1, name2, ...
TABLE (f1 T1, f2 T2, ...)	inline(ARRAY>)	表生成函數(shù)，將給定struct數(shù)組展開，每個(gè)元素對(duì)應(yīng)一行，每行每個(gè)struct元素對(duì)應(yīng)一列

Tunnel SDK 介紹

Tunnel 是 ODPS 的數(shù)據(jù)通道，用戶可以通過 Tunnel 向 ODPS 中上傳或者下載數(shù)據(jù)。
TableTunnel 是訪問 ODPS Tunnel 服務(wù)的入口類，僅支持表數(shù)據(jù)（非視圖）的上傳和下載。

對(duì)一張表或 partition 上傳下載的過程，稱為一個(gè)session。session 由一或多個(gè)到 Tunnel RESTful API 的 HTTP Request 組成。
session 用 session ID 來(lái)標(biāo)識(shí)，session 的超時(shí)時(shí)間是24小時(shí)，如果大批量數(shù)據(jù)傳輸導(dǎo)致超過24小時(shí)，需要自行拆分成多個(gè) session。
數(shù)據(jù)的上傳和下載分別由 TableTunnel.UploadSession 和 TableTunnel.DownloadSession 這兩個(gè)會(huì)話來(lái)負(fù)責(zé)。
TableTunnel 提供創(chuàng)建 UploadSession 對(duì)象和 DownloadSession 對(duì)象的方法.

典型表數(shù)據(jù)上傳流程：
1) 創(chuàng)建 TableTunnel
2) 創(chuàng)建 UploadSession
3) 創(chuàng)建 RecordWriter,寫入 Record
4）提交上傳操作
典型表數(shù)據(jù)下載流程：
1) 創(chuàng)建 TableTunnel
2) 創(chuàng)建 DownloadSession
3) 創(chuàng)建 RecordReader,讀取 Record

基于Tunnel SDK構(gòu)造復(fù)雜類型數(shù)據(jù)

代碼示例：

            RecordWriter recordWriter = uploadSession.openRecordWriter(0);
      ArrayRecord record = (ArrayRecord) uploadSession.newRecord();

      // prepare data
      List arrayData = Arrays.asList(1, 2, 3);
      Map<String, Long> mapData = new HashMap<String, Long>();
      mapData.put("a", 1L);
      mapData.put("c", 2L);

      List<Object> structData = new ArrayList<Object>();
      structData.add("Lily");
      structData.add(18);

      // set data to record
      record.setArray(0, arrayData);
      record.setMap(1, mapData);
      record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(),
                                           structData));

      // write the record
      recordWriter.write(record);

從MaxCompute下載復(fù)雜類型數(shù)據(jù)

代碼示例：

            RecordReader recordReader = downloadSession.openRecordReader(0, 1);

      // read the record
      ArrayRecord record1 = (ArrayRecord)recordReader.read();

      // get array field data
      List field0 = record1.getArray(0);
      List<Long> longField0 = record1.getArray(Long.class, 0);

      // get map field data
      Map field1 = record1.getMap(1);
      Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);

      // get struct field data
      Struct field2 = record1.getStruct(2);

運(yùn)行實(shí)例

完整代碼如下：

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.aliyun.odps.Odps;
import com.aliyun.odps.PartitionSpec;
import com.aliyun.odps.TableSchema;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.ArrayRecord;
import com.aliyun.odps.data.RecordReader;
import com.aliyun.odps.data.RecordWriter;
import com.aliyun.odps.data.SimpleStruct;
import com.aliyun.odps.data.Struct;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;
import com.aliyun.odps.tunnel.TunnelException;
import com.aliyun.odps.type.StructTypeInfo;

public class TunnelComplexTypeSample {

  private static String accessId = "<your access id>";
  private static String accessKey = "<your access Key>";
  private static String odpsUrl = "<your odps endpoint>";
  private static String project = "<your project>";

  private static String table = "<your table name>";

  // partitions of a partitioned table, eg: "pt=\'1\',ds=\'2\'"
  // if the table is not a partitioned table, do not need it
  private static String partition = "<your partition spec>";

  public static void main(String args[]) {
    Account account = new AliyunAccount(accessId, accessKey);
    Odps odps = new Odps(account);
    odps.setEndpoint(odpsUrl);
    odps.setDefaultProject(project);

    try {
      TableTunnel tunnel = new TableTunnel(odps);
      PartitionSpec partitionSpec = new PartitionSpec(partition);

      // ---------- Upload Data ---------------
      // create upload session for table
      // the table schema is {"col0": ARRAY<BIGINT>, "col1": MAP<STRING, BIGINT>, "col2": STRUCT<name:STRING,age:BIGINT>}
      UploadSession uploadSession = tunnel.createUploadSession(project, table, partitionSpec);
      // get table schema
      TableSchema schema = uploadSession.getSchema();

      // open record writer
      RecordWriter recordWriter = uploadSession.openRecordWriter(0);
      ArrayRecord record = (ArrayRecord) uploadSession.newRecord();

      // prepare data
      List arrayData = Arrays.asList(1, 2, 3);
      Map<String, Long> mapData = new HashMap<String, Long>();
      mapData.put("a", 1L);
      mapData.put("c", 2L);

      List<Object> structData = new ArrayList<Object>();
      structData.add("Lily");
      structData.add(18);

      // set data to record
      record.setArray(0, arrayData);
      record.setMap(1, mapData);
      record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(),
                                           structData));

      // write the record
      recordWriter.write(record);

      // close writer
      recordWriter.close();

      // commit uploadSession, the upload finish
      uploadSession.commit(new Long[]{0L});
      System.out.println("upload success!");

      // ---------- Download Data ---------------
      // create download session for table
      // the table schema is {"col0": ARRAY<BIGINT>, "col1": MAP<STRING, BIGINT>, "col2": STRUCT<name:STRING,age:BIGINT>}
      DownloadSession downloadSession = tunnel.createDownloadSession(project, table, partitionSpec);
      schema = downloadSession.getSchema();

      // open record reader, read one record here for example
      RecordReader recordReader = downloadSession.openRecordReader(0, 1);

      // read the record
      ArrayRecord record1 = (ArrayRecord)recordReader.read();

      // get array field data
      List field0 = record1.getArray(0);
      List<Long> longField0 = record1.getArray(Long.class, 0);

      // get map field data
      Map field1 = record1.getMap(1);
      Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);

      // get struct field data
      Struct field2 = record1.getStruct(2);

      System.out.println("download success!");
    } catch (TunnelException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }

  }
}

看完上述內(nèi)容，你們掌握如何使用Tunnel SDK上傳和下載MaxCompute復(fù)雜類型數(shù)據(jù)的方法了嗎？如果還想學(xué)到更多技能或想了解更多相關(guān)內(nèi)容，歡迎關(guān)注創(chuàng)新互聯(lián)行業(yè)資訊頻道，感謝各位的閱讀！

網(wǎng)站標(biāo)題：如何使用TunnelSDK上傳和下載MaxCompute復(fù)雜類型數(shù)據(jù)
當(dāng)前地址：http://muchs.cn/article16/gphegg.html

成都網(wǎng)站建設(shè)公司_創(chuàng)新互聯(lián)，為您提供外貿(mào)網(wǎng)站建設(shè)、做網(wǎng)站、響應(yīng)式網(wǎng)站、用戶體驗(yàn)、靜態(tài)網(wǎng)站、自適應(yīng)網(wǎng)站

聲明：本網(wǎng)站發(fā)布的內(nèi)容（圖片、視頻和文字）以用戶投稿、用戶轉(zhuǎn)載內(nèi)容為主，如果涉及侵權(quán)請(qǐng)盡快告知，我們將會(huì)在第一時(shí)間刪除。文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng)，如需處理請(qǐng)聯(lián)系客服。電話：028-86922220；郵箱：631063699@qq.com。內(nèi)容未經(jīng)允許不得轉(zhuǎn)載，或轉(zhuǎn)載時(shí)需注明來(lái)源：創(chuàng)新互聯(lián)

猜你還喜歡下面的內(nèi)容