本篇文章将介绍以主用户模式调用REST API来访问或处理SharePoint Online上的文件,主要用于没有SharePoint官方接口的应用程序或需要以编程方式访问SharePoint时的场景。
获取访问令牌
在调用SharePoint的REST API前,需要先获取到访问令牌,可以使用主用户模式或服务主体模式。但通过服务主体模式访问Azure应用来获取访问令牌时,需要使用PowerShell生成并上传证书,且最后调用时较麻烦,因此本篇文章只介绍主用户模式。而关于服务主体模式,将改用SharePoint应用来实现,相比Azure应用简单不少,这将在另一篇文章中进行介绍。
关于获取访问令牌的具体步骤与原理,请参考我的另一篇文章:Azure应用注册与访问令牌获取,并从该文章末尾处复制给出的现成代码,然后填写相关参数,并把验证模式设置为主用户模式即可。此外,由于后续需要调用的是SharePoint的REST API,因此还需要填写SCOPE URL,具体如下图所示:
需要注意,这里必须将验证模式设置为主用户模式,若改用服务主体模式将报错,因为给出的现成代码是通过Azure应用的客户端密码来实现服务主体模式的,而不是上传证书。正如之前所提到的,通过服务主体模式访问Azure应用来获取访问令牌时需要使用上传证书的方式,因为其它方式都被屏蔽了。
调用REST API读取SharePoint上的文件
SharePoint的REST API有很多,这里只介绍读取文件所用到的API,若对其它API感兴趣,可以自行浏览官方文档:了解 SharePoint REST 服务。
API接口:
Url:https://{site_url}/_api/web/GetFolderByServerRelativeUrl('{folder_path}')/Files('{file_name}')/$value
Method:GET
Request Headers: {Authorization:访问令牌}
参数说明:
1. site_url
SharePoint站点的根目录,可打开对应的SharePoint站点后从浏览器上方的地址栏处获得,格式为:https://TenantName.sharepoint.com/sites/SiteName
2. folder_path
需要读取的文件所位于的文件夹的路径,可使用绝对路径或相对路径。绝对路径以/SiteName为起点,注意前面的斜杠;相对路径则以Shared Documents为起点,注意前面是没有斜杠的。
为帮助理解,以下图所示的Excel文件所在位置为例,其所位于的文件夹的路径为:
绝对路径:/SharePointRESTAPIReadFile/Shared Documents/TestFolder
相对路径:Shared Documents/TestFolder
3. file_name
需要读取的文件的文件名,含格式后缀
4. Authorization
前面获取到的访问令牌,需要作为请求头参数传递
当所有参数都获取到后,就可以调用该REST API读取文件了。以Python为例,调用REST API读取文件的代码如下:
site_url = "https://TenantName.sharepoint.com/sites/SiteName"
response = requests.get(
"{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
headers={'Authorization':token}
)
if response.status_code in [200,202]:
with io.BytesIO(response.content) as fh:
df = pd.read_excel(fh, sheet_name='SheetName')
print(df)
else:
print(response.text)
完整的实现代码
为方便使用,下面给出从获取访问令牌到调用REST API读取SharePoint文件的完整实现代码,有多个版本,但都需要使用主用户模式,不支持服务主体模式。
PowerQuery OAuth2.0 :
let
// Define a function to acquire access token.
get_access_token = (ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL) =>
/*
Author: 夕枫
Function Description:
Generates and returns Access token
Returns:
string: Access token
Below params if don't need to use, pls use blank string to replace, such as: ""
ENVIRONMENT_TYPE : Can be set to 'Global' or '21Vianet'
AUTHENTICATION_MODE : Can be set to 'MasterUser' or 'ServicePrincipal'
CLIENT_ID : Client Id (Application Id) of the AAD app
TENANT_ID : Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET : Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
USER_ACCOUNT : Master user email address. Required only for MasterUser authentication mode.
USER_PASSWORD : Master user email password. Required only for MasterUser authentication mode.
SCOPE_BASE : Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
AUTHORITY_URL : URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
*/
try
if not List.Contains({"global","21vianet"},Text.Lower(ENVIRONMENT_TYPE)) then
error "Error! Pls check the input of ENVIRONMENT_TYPE."
else let
SCOPE_BASE = if SCOPE_BASE<>"" then SCOPE_BASE else if Text.Lower(ENVIRONMENT_TYPE)="global" then "https://analysis.windows.net/powerbi/api/.default" else "https://analysis.chinacloudapi.cn/powerbi/api/.default",
AUTHORITY_URL = if AUTHORITY_URL<>"" then SCOPE_BASE else if Text.Lower(ENVIRONMENT_TYPE)="global" then "https://login.microsoftonline.com/organizations/oauth2/v2.0/token" else "https://login.chinacloudapi.cn/organizations/oauth2/v2.0/token",
RESPONSE =
if Text.Lower(AUTHENTICATION_MODE) = "masteruser" then
Web.Contents(
AUTHORITY_URL,
[
Headers = [#"Content-Type"="application/x-www-form-urlencoded"],
Content =
Text.ToBinary(
Text.Format(
"client_id=#{0}&scope=#{1}&username=#{2}&password=#{3}&grant_type=password",
{CLIENT_ID,SCOPE_BASE,USER_ACCOUNT,USER_PASSWORD}
)
)
]
)
else if Text.Lower(AUTHENTICATION_MODE) = "serviceprincipal" then
Web.Contents(
Text.Replace(AUTHORITY_URL,"organizations",TENANT_ID),
[
Headers = [#"Content-Type"="application/x-www-form-urlencoded"],
Content =
Text.ToBinary(
Text.Format(
"client_id=#{0}&scope=#{1}&client_secret=#{2}&grant_type=client_credentials",
{CLIENT_ID,SCOPE_BASE,CLIENT_SECRET}
)
)
]
)
else
error "Error! Pls check the input of AUTHENTICATION_MODE."
in "Bearer " & Json.Document(RESPONSE)[access_token]
otherwise error "Error retrieving Access token, pls check the input params.",
// ------------------------------------------------config setting-----------------------------------------------------
// Can be set to 'Global' or '21Vianet'
ENVIRONMENT_TYPE = "Global",
// Can be set to 'MasterUser' or 'ServicePrincipal'
AUTHENTICATION_MODE = "MasterUser",
// Client Id (Application Id) of the AAD app
CLIENT_ID = "Input CLIENT_ID here.",
// Below params if don't need to use, pls use blank string to replace, such as: ''
// Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
TENANT_ID = "",
// Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET = "",
// Master user email address. Required only for MasterUser authentication mode.
USER_ACCOUNT = "YourMicrosoftAccount",
// Master user email password. Required only for MasterUser authentication mode.
USER_PASSWORD = "YourPassWord",
// If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API.
SCOPE_BASE = "https://TenantName.sharepoint.com/.default",
// The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
AUTHORITY_URL = "",
// ------------------------------------------------config setting end-------------------------------------------------
TOKEN = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL),
// -------------------------------------------Call REST API by above TOKEN--------------------------------------------
// CallAPI = xxxxx
Site_Url = "https://TenantName.sharepoint.com/sites/SiteName",
Response =
Web.Contents(
Site_Url & "/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value",
[
Headers = [Authorization=TOKEN]
]
),
Result = Excel.Workbook(Response,true,null)
in
Result
Python OAuth2.0 :
import requests
import os,io
import pandas as pd
def get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE='',AUTHORITY_URL=''):
'''
Author: 夕枫
Function Description:
Generates and returns Access token
Returns:
string: Access token
Below params if don't need to use, pls use blank string to replace, such as: ''
ENVIRONMENT_TYPE : Can be set to 'Global' or '21Vianet'
AUTHENTICATION_MODE : Can be set to 'MasterUser' or 'ServicePrincipal'
CLIENT_ID : Client Id (Application Id) of the AAD app
TENANT_ID : Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET : Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
USER_ACCOUNT : Master user email address. Required only for MasterUser authentication mode.
USER_PASSWORD : Master user email password. Required only for MasterUser authentication mode.
SCOPE_BASE : Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
AUTHORITY_URL : URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
'''
if ENVIRONMENT_TYPE.lower() not in ['global','21vianet']:
return 'Error! Pls check the input of ENVIRONMENT_TYPE.'
if SCOPE_BASE=='':
SCOPE_BASE = 'https://analysis.windows.net/powerbi/api/.default' if ENVIRONMENT_TYPE.lower()=="global" else 'https://analysis.chinacloudapi.cn/powerbi/api/.default'
if AUTHORITY_URL=='':
AUTHORITY_URL = 'https://login.microsoftonline.com/organizations/oauth2/v2.0/token' if ENVIRONMENT_TYPE.lower()=="global" else 'https://login.chinacloudapi.cn/organizations/oauth2/v2.0/token'
try:
if AUTHENTICATION_MODE.lower() == 'masteruser':
response = requests.post(
AUTHORITY_URL,
headers={'Content-Type':'application/x-www-form-urlencoded'},
data={
'client_id':CLIENT_ID,
'scope':SCOPE_BASE,
'username':USER_ACCOUNT,
'password':USER_PASSWORD,
'grant_type':'password'
}
)
elif AUTHENTICATION_MODE.lower() == 'serviceprincipal':
response = requests.post(
AUTHORITY_URL.replace('organizations', TENANT_ID),
headers={'Content-Type':'application/x-www-form-urlencoded'},
data={
'client_id':CLIENT_ID,
'scope':SCOPE_BASE,
'client_secret':CLIENT_SECRET,
'grant_type':'client_credentials'
}
)
else:
return 'Error! Pls check the input of AUTHENTICATION_MODE.'
return 'Bearer ' + response.json()['access_token']
except Exception as ex:
return 'Error retrieving Access token : '+str(ex)
if __name__ == '__main__':
# ----------------------------------------------------------config setting----------------------------------------------------------------
# Can be set to 'Global' or '21Vianet'
ENVIRONMENT_TYPE = 'Global'
# Can be set to 'MasterUser' or 'ServicePrincipal'
AUTHENTICATION_MODE = 'MasterUser'
# Client Id (Application Id) of the AAD app
CLIENT_ID = 'Input CLIENT_ID here.'
# Below params if don't need to use, pls use blank string to replace, such as: ''
# Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
TENANT_ID = ''
# Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET = ''
# Master user email address. Required only for MasterUser authentication mode.
USER_ACCOUNT = 'YourMicrosoftAccount'
# Master user email password. Required only for MasterUser authentication mode.
USER_PASSWORD = 'YourPassWord'
# If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API.
SCOPE_BASE = 'https://TenantName.sharepoint.com/.default'
# The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
AUTHORITY_URL = ''
# ---------------------------------------------------------config setting end---------------------------------------------------------------
token = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL)
if 'Bearer' in token:
# Get Excel File Content
site_url = "https://TenantName.sharepoint.com/sites/SiteName"
response = requests.get(
"{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
headers={'Authorization':token}
)
if response.status_code in [200,202]:
with io.BytesIO(response.content) as fh:
df = pd.read_excel(fh, sheet_name='SheetName')
print(df)
else:
print(response.text)
else:
print(token)
os.system('pause')
Python MSAL :
import msal
import requests
import os,io
import pandas as pd
def get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE='',AUTHORITY_URL=''):
'''
Author: 夕枫
Function Description:
Generates and returns Access token
Returns:
string: Access token
Below params if don't need to use, pls use blank string to replace, such as: ''
ENVIRONMENT_TYPE : Can be set to 'Global' or '21Vianet'
AUTHENTICATION_MODE : Can be set to 'MasterUser' or 'ServicePrincipal'
CLIENT_ID : Client Id (Application Id) of the AAD app
TENANT_ID : Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET : Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
USER_ACCOUNT : Master user email address. Required only for MasterUser authentication mode.
USER_PASSWORD : Master user email password. Required only for MasterUser authentication mode.
SCOPE_BASE : Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
AUTHORITY_URL : URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
'''
if ENVIRONMENT_TYPE.lower() not in ['global','21vianet']:
return 'Error! Pls check the input of ENVIRONMENT_TYPE.'
if SCOPE_BASE=='':
SCOPE_BASE = ['https://analysis.windows.net/powerbi/api/.default'] if ENVIRONMENT_TYPE.lower()=="global" else ['https://analysis.chinacloudapi.cn/powerbi/api/.default']
if AUTHORITY_URL=='':
AUTHORITY_URL = 'https://login.microsoftonline.com/organizations' if ENVIRONMENT_TYPE.lower()=="global" else 'https://login.chinacloudapi.cn/organizations'
try:
response = None
if AUTHENTICATION_MODE.lower() == 'masteruser':
# Create a public client to authorize the app with the AAD app
clientapp = msal.PublicClientApplication(CLIENT_ID, authority=AUTHORITY_URL)
accounts = clientapp.get_accounts(username=USER_ACCOUNT)
if accounts:
# Retrieve Access token from user cache if available
response = clientapp.acquire_token_silent(SCOPE_BASE, account=accounts[0])
if not response:
# Make a client call if Access token is not available in cache
response = clientapp.acquire_token_by_username_password(USER_ACCOUNT, USER_PASSWORD, scopes=SCOPE_BASE)
elif AUTHENTICATION_MODE.lower() == 'serviceprincipal':
authority = AUTHORITY_URL.replace('organizations', TENANT_ID)
clientapp = msal.ConfidentialClientApplication(CLIENT_ID, client_credential=CLIENT_SECRET, authority=authority)
# Make a client call if Access token is not available in cache
response = clientapp.acquire_token_for_client(scopes=SCOPE_BASE)
else:
return 'Error! Pls check the input of AUTHENTICATION_MODE.'
return 'Bearer ' + response['access_token']
except Exception as ex:
return 'Error retrieving Access token : ' + str(ex)
if __name__ == '__main__':
# ----------------------------------------------------------config setting----------------------------------------------------------------
# Can be set to 'Global' or '21Vianet'
ENVIRONMENT_TYPE = 'Global'
# Can be set to 'MasterUser' or 'ServicePrincipal'
AUTHENTICATION_MODE = 'MasterUser'
# Client Id (Application Id) of the AAD app
CLIENT_ID = 'Input CLIENT_ID here.'
# Below params if don't need to use, pls use blank string to replace, such as: ''
# Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
TENANT_ID = ''
# Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
CLIENT_SECRET = ''
# Master user email address. Required only for MasterUser authentication mode.
USER_ACCOUNT = 'YourMicrosoftAccount'
# Master user email password. Required only for MasterUser authentication mode.
USER_PASSWORD = 'YourPassWord'
# If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API. And the input scopes should be a list, tuple, or set.
SCOPE_BASE = ['https://TenantName.sharepoint.com/.default']
# The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
AUTHORITY_URL = ''
# ---------------------------------------------------------config setting end---------------------------------------------------------------
token = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL)
if 'Bearer' in token:
# Get Excel File Content
site_url = "https://TenantName.sharepoint.com/sites/SiteName"
response = requests.get(
"{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
headers={'Authorization':token}
)
if response.status_code in [200,202]:
with io.BytesIO(response.content) as fh:
df = pd.read_excel(fh, sheet_name='SheetName')
print(df)
else:
print(response.text)
else:
print(token)
os.system('pause')
Python Office365-REST-Python-Client :
# pip install Office365-REST-Python-Client
# pip install pandas
# https://github.com/vgrem/Office365-REST-Python-Client
# Author: 夕枫
from office365.runtime.auth.user_credential import UserCredential
from office365.runtime.http.request_options import RequestOptions
from office365.sharepoint.client_context import ClientContext
import os,io
import pandas as pd
site_url = "https://TenantName.sharepoint.com/sites/SiteName"
# MasterUser Mode, accessing resources with account password
ctx = ClientContext(site_url).with_credentials(UserCredential("YourMicrosoftAccount", "YourPassWord"))
# Retrieve all files under a folder, excluding files under subfolders
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
print(response.text)
# Retrieve a specific file under a folder and return information such as its properties
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
print(response.text)
# Retrieve a specific file under a folder and return binary content
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
with io.BytesIO(response.content) as fh:
df = pd.read_excel(fh, sheet_name='SheetName')
print(df)
# If you know the URL address of a specific file, you can directly access that specific file and return binary content
request = RequestOptions("https://TenantName.sharepoint.com/sites/SiteName/Shared Documents/FolderName1/FolderName2/../FolderName/ExcelFileName.xlsx")
response = ctx.pending_request().execute_request_direct(request)
with io.BytesIO(response.content) as fh:
df = pd.read_excel(fh, sheet_name='SheetName')
print(df)
os.system('pause')