Step 1: Create a Shared Access Signature (SAS) URL.
Step 2: Mount the container into DBFS with dbutils.fs.mount(source=.., mount_point=.., extra_configs=..).

Mounting the Azure Blob container:
SasURL = "https://dbtraineastus2.blob.core.windows.net/?sv=2017-07-29&ss=b&srt=sco&sp=rl&se=2023-04-19T06:32:30Z&st=2018-04-18T22:32:30Z&spr=https&sig=BB%2FQzc0XHAH%2FarDQhKcpu49feb7llv3ZjnfViuI9IWo%3D"
indQuestionMark = SasURL.index('?')
SasKey = SasURL[indQuestionMark:]  # everything from '?' onward, i.e. the SAS token
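The slicing above can be sanity-checked in plain Python outside Databricks. A minimal sketch, using a placeholder URL (the hostname and signature below are dummy values, not a working SAS):

```python
# Hypothetical SAS URL for illustration; the signature is a fake value.
sas_url = "https://example.blob.core.windows.net/?sv=2017-07-29&sig=FAKE"

# str.index raises ValueError if '?' is missing, surfacing a malformed URL early.
ind_question_mark = sas_url.index('?')
sas_key = sas_url[ind_question_mark:]  # keep the leading '?' with the token

print(sas_key)  # → ?sv=2017-07-29&sig=FAKE
```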
StorageAccount = "dbtraineastus2"
ContainerName = "training"
MountPoint = "/mnt/temp-training"
dbutils.fs.mount(
  source = "wasbs://%s@%s.blob.core.windows.net/" % (ContainerName, StorageAccount),
  mount_point = MountPoint,
  extra_configs = {"fs.azure.sas.%s.%s.blob.core.windows.net" % (ContainerName, StorageAccount): SasKey}
)
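The two formatted strings passed to dbutils.fs.mount() can be built and inspected on their own. A minimal sketch of the string assembly, reusing the container and account names from the code above (lower-case variable names here are illustrative):

```python
# Values mirroring the mount call above.
container_name = "training"
storage_account = "dbtraineastus2"

# Source URI follows the pattern wasbs://<container>@<account>.blob.core.windows.net/
source = "wasbs://%s@%s.blob.core.windows.net/" % (container_name, storage_account)

# Config key under which Databricks looks up the SAS token for this container/account pair.
conf_key = "fs.azure.sas.%s.%s.blob.core.windows.net" % (container_name, storage_account)

print(source)    # wasbs://training@dbtraineastus2.blob.core.windows.net/
print(conf_key)  # fs.azure.sas.training.dbtraineastus2.blob.core.windows.net
```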
List the files now visible through the mount point:
%fs ls /mnt/temp-training
Create a DataFrame from a CSV file in the Azure Blob container (the original used Scala's `val` here; in this Python notebook the call is):

myDF = spark.read.option("header", "true").option("inferSchema", "true").csv("dbfs:/mnt/temp-training/myfile.csv")
To unmount, call dbutils.fs.unmount(), passing the mount_point defined when the container was mounted:
dbutils.fs.unmount("/mnt/temp-training")