回忆是一座桥
却是通往寂寞的牢

通过REST API读取SharePoint Online上的文件 - 主用户模式

本篇文章将介绍以主用户模式调用REST API来访问或处理SharePoint Online上的文件,主要用于没有SharePoint官方接口的应用程序或需要以编程方式访问SharePoint时的场景。

获取访问令牌

在调用SharePoint的REST API前,需要先获取到访问令牌,可以使用主用户模式或服务主体模式。但通过服务主体模式访问Azure应用来获取访问令牌时,需要使用PowerShell生成并上传证书,且最后调用时较麻烦,因此本篇文章只介绍主用户模式。而关于服务主体模式,将改用SharePoint应用来实现,相比Azure应用简单不少,这将在另一篇文章中进行介绍。

关于获取访问令牌的具体步骤与原理,请参考我的另一篇文章:Azure应用注册与访问令牌获取,并从该文章末尾处复制给出的现成代码,然后填写相关参数,并把验证模式设置为主用户模式即可。此外,由于后续需要调用的是SharePoint的REST API,因此还需要填写SCOPE URL,具体如下图所示:

需要注意,这里必须将验证模式设置为主用户模式,若改用服务主体模式将报错,因为给出的现成代码是通过Azure应用的客户端密码来实现服务主体模式的,而不是上传证书。正如之前所提到的,通过服务主体模式访问Azure应用来获取访问令牌时需要使用上传证书的方式,因为其它方式都被屏蔽了。

调用REST API读取SharePoint上的文件

SharePoint的REST API有很多,这里只介绍读取文件所用到的API,若对其它API感兴趣,可以自行浏览官方文档:了解 SharePoint REST 服务

API接口:

Url:https://{site_url}/_api/web/GetFolderByServerRelativeUrl('{folder_path}')/Files('{file_name}')/$value

Method:GET
Request Headers: {Authorization:访问令牌}

参数说明:

1. site_url

SharePoint站点的根目录,可打开对应的SharePoint站点后从浏览器上方的地址栏处获得,格式为:https://TenantName.sharepoint.com/sites/SiteName

2. folder_path

需要读取的文件所位于的文件夹的路径,可使用绝对路径或相对路径。绝对路径以/SiteName为起点,注意前面的斜杠;相对路径则以Shared Documents为起点,注意前面是没有斜杠的。

为帮助理解,以下图所示的Excel文件所在位置为例,其所位于的文件夹的路径为:

绝对路径:/SharePointRESTAPIReadFile/Shared Documents/TestFolder

相对路径:Shared Documents/TestFolder

3. file_name

需要读取的文件的文件名,含格式后缀

4. Authorization

前面获取到的访问令牌,需要作为请求头参数传递


当所有参数都获取到后,就可以调用该REST API读取文件了。以Python为例,调用REST API读取文件的代码如下:

site_url = "https://TenantName.sharepoint.com/sites/SiteName"

response = requests.get(
    "{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
    headers={'Authorization':token}
)

if response.status_code in [200,202]:
    with io.BytesIO(response.content) as fh:
        df = pd.read_excel(fh, sheet_name='SheetName')
        print(df)
else:
    print(response.text)

完整的实现代码

为方便使用,下面给出从获取访问令牌到调用REST API读取SharePoint文件的完整实现代码,有多个版本,但都需要使用主用户模式,不支持服务主体模式。


PowerQuery OAuth2.0 :

let
    // Define a function to acquire access token. 
    get_access_token = (ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL) =>
        /*
        Author: 夕枫
        Function Description: 
            Generates and returns Access token
        Returns:
            string: Access token

        Below params if don't need to use, pls use blank string to replace, such as: "" 

        ENVIRONMENT_TYPE :      Can be set to 'Global' or '21Vianet'
        AUTHENTICATION_MODE :   Can be set to 'MasterUser' or 'ServicePrincipal'
        CLIENT_ID :             Client Id (Application Id) of the AAD app
        TENANT_ID :             Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
        CLIENT_SECRET :         Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
        USER_ACCOUNT :          Master user email address. Required only for MasterUser authentication mode.
        USER_PASSWORD :         Master user email password. Required only for MasterUser authentication mode.
        SCOPE_BASE :            Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
        AUTHORITY_URL :         URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
        */
        try
            if not List.Contains({"global","21vianet"},Text.Lower(ENVIRONMENT_TYPE)) then
                error "Error! Pls check the input of ENVIRONMENT_TYPE."
            else let
                SCOPE_BASE = if SCOPE_BASE<>"" then SCOPE_BASE else if Text.Lower(ENVIRONMENT_TYPE)="global" then "https://analysis.windows.net/powerbi/api/.default" else "https://analysis.chinacloudapi.cn/powerbi/api/.default",
                AUTHORITY_URL = if AUTHORITY_URL<>"" then SCOPE_BASE else if Text.Lower(ENVIRONMENT_TYPE)="global" then "https://login.microsoftonline.com/organizations/oauth2/v2.0/token" else "https://login.chinacloudapi.cn/organizations/oauth2/v2.0/token",
                RESPONSE = 
                    if Text.Lower(AUTHENTICATION_MODE) = "masteruser" then
                        Web.Contents(
                            AUTHORITY_URL,
                            [
                                Headers = [#"Content-Type"="application/x-www-form-urlencoded"],
                                Content = 
                                    Text.ToBinary(
                                        Text.Format(
                                            "client_id=#{0}&scope=#{1}&username=#{2}&password=#{3}&grant_type=password",
                                            {CLIENT_ID,SCOPE_BASE,USER_ACCOUNT,USER_PASSWORD}
                                        )
                                    )
                            ]
                        )
                    else if Text.Lower(AUTHENTICATION_MODE) = "serviceprincipal" then
                        Web.Contents(
                            Text.Replace(AUTHORITY_URL,"organizations",TENANT_ID),
                            [
                                Headers = [#"Content-Type"="application/x-www-form-urlencoded"],
                                Content = 
                                    Text.ToBinary(
                                        Text.Format(
                                            "client_id=#{0}&scope=#{1}&client_secret=#{2}&grant_type=client_credentials",
                                            {CLIENT_ID,SCOPE_BASE,CLIENT_SECRET}
                                        )
                                    )
                            ]
                        )
                    else
                        error "Error! Pls check the input of AUTHENTICATION_MODE."
                in "Bearer " & Json.Document(RESPONSE)[access_token]
        otherwise error "Error retrieving Access token, pls check the input params.",

    // ------------------------------------------------config setting-----------------------------------------------------
    // Can be set to 'Global' or '21Vianet'
    ENVIRONMENT_TYPE = "Global",

    // Can be set to 'MasterUser' or 'ServicePrincipal'
    AUTHENTICATION_MODE = "MasterUser",

    // Client Id (Application Id) of the AAD app
    CLIENT_ID = "Input CLIENT_ID here.",

    // Below params if don't need to use, pls use blank string to replace, such as: '' 

    // Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
    TENANT_ID = "",

    // Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
    CLIENT_SECRET = "",

    // Master user email address. Required only for MasterUser authentication mode.
    USER_ACCOUNT = "YourMicrosoftAccount",

    // Master user email password. Required only for MasterUser authentication mode.
    USER_PASSWORD = "YourPassWord",

    // If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API.
    SCOPE_BASE = "https://TenantName.sharepoint.com/.default",

    // The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
    AUTHORITY_URL = "",

    // ------------------------------------------------config setting end-------------------------------------------------

    TOKEN = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL),

    // -------------------------------------------Call REST API by above TOKEN--------------------------------------------

    // CallAPI = xxxxx
    Site_Url = "https://TenantName.sharepoint.com/sites/SiteName",
    Response = 
        Web.Contents(
            Site_Url & "/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value",
            [
                Headers = [Authorization=TOKEN]
            ]
        ),
    Result = Excel.Workbook(Response,true,null)
in
    Result

Python OAuth2.0 :

import requests
import os,io
import pandas as pd

def get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE='',AUTHORITY_URL=''):
    '''
    Author: 夕枫
    Function Description: 
        Generates and returns Access token
    Returns:
        string: Access token

    Below params if don't need to use, pls use blank string to replace, such as: '' 

    ENVIRONMENT_TYPE :      Can be set to 'Global' or '21Vianet'
    AUTHENTICATION_MODE :   Can be set to 'MasterUser' or 'ServicePrincipal'
    CLIENT_ID :             Client Id (Application Id) of the AAD app
    TENANT_ID :             Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
    CLIENT_SECRET :         Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
    USER_ACCOUNT :          Master user email address. Required only for MasterUser authentication mode.
    USER_PASSWORD :         Master user email password. Required only for MasterUser authentication mode.
    SCOPE_BASE :            Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
    AUTHORITY_URL :         URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
    '''
    if ENVIRONMENT_TYPE.lower() not in ['global','21vianet']:
        return 'Error! Pls check the input of ENVIRONMENT_TYPE.'

    if SCOPE_BASE=='':
        SCOPE_BASE = 'https://analysis.windows.net/powerbi/api/.default' if ENVIRONMENT_TYPE.lower()=="global" else 'https://analysis.chinacloudapi.cn/powerbi/api/.default'
    if AUTHORITY_URL=='':
        AUTHORITY_URL = 'https://login.microsoftonline.com/organizations/oauth2/v2.0/token' if ENVIRONMENT_TYPE.lower()=="global" else 'https://login.chinacloudapi.cn/organizations/oauth2/v2.0/token'

    try:
        if AUTHENTICATION_MODE.lower() == 'masteruser':
            response = requests.post(
                AUTHORITY_URL,
                headers={'Content-Type':'application/x-www-form-urlencoded'},
                data={
                    'client_id':CLIENT_ID,
                    'scope':SCOPE_BASE,
                    'username':USER_ACCOUNT,
                    'password':USER_PASSWORD,
                    'grant_type':'password'
                }
            )
        elif AUTHENTICATION_MODE.lower() == 'serviceprincipal':
            response = requests.post(
                AUTHORITY_URL.replace('organizations', TENANT_ID),
                headers={'Content-Type':'application/x-www-form-urlencoded'},
                data={
                    'client_id':CLIENT_ID,
                    'scope':SCOPE_BASE,
                    'client_secret':CLIENT_SECRET,
                    'grant_type':'client_credentials'
                }
            )
        else:
            return 'Error! Pls check the input of AUTHENTICATION_MODE.'

        return 'Bearer ' + response.json()['access_token']

    except Exception as ex:
        return 'Error retrieving Access token : '+str(ex)

if __name__ == '__main__':
    # ----------------------------------------------------------config setting----------------------------------------------------------------

    # Can be set to 'Global' or '21Vianet'
    ENVIRONMENT_TYPE = 'Global'      

    # Can be set to 'MasterUser' or 'ServicePrincipal'
    AUTHENTICATION_MODE = 'MasterUser'

    # Client Id (Application Id) of the AAD app
    CLIENT_ID = 'Input CLIENT_ID here.'

    # Below params if don't need to use, pls use blank string to replace, such as: '' 

    # Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
    TENANT_ID = ''

    # Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
    CLIENT_SECRET = ''

    # Master user email address. Required only for MasterUser authentication mode.
    USER_ACCOUNT = 'YourMicrosoftAccount'

    # Master user email password. Required only for MasterUser authentication mode.
    USER_PASSWORD = 'YourPassWord'

    # If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API.
    SCOPE_BASE = 'https://TenantName.sharepoint.com/.default'

    # The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
    AUTHORITY_URL = ''

    # ---------------------------------------------------------config setting end---------------------------------------------------------------

    token = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL)

    if 'Bearer' in token:
        # Get Excel File Content
        site_url = "https://TenantName.sharepoint.com/sites/SiteName"

        response = requests.get(
            "{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
            headers={'Authorization':token}
        )

        if response.status_code in [200,202]:
            with io.BytesIO(response.content) as fh:
                df = pd.read_excel(fh, sheet_name='SheetName')
                print(df)
        else:
            print(response.text)
    else:
        print(token)

    os.system('pause')

Python MSAL :

import msal
import requests
import os,io
import pandas as pd

def get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE='',AUTHORITY_URL=''):
    '''
    Author: 夕枫
    Function Description: 
        Generates and returns Access token
    Returns:
        string: Access token

    Below params if don't need to use, pls use blank string to replace, such as: '' 

    ENVIRONMENT_TYPE :      Can be set to 'Global' or '21Vianet'
    AUTHENTICATION_MODE :   Can be set to 'MasterUser' or 'ServicePrincipal'
    CLIENT_ID :             Client Id (Application Id) of the AAD app
    TENANT_ID :             Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
    CLIENT_SECRET :         Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
    USER_ACCOUNT :          Master user email address. Required only for MasterUser authentication mode.
    USER_PASSWORD :         Master user email password. Required only for MasterUser authentication mode.
    SCOPE_BASE :            Scope Base of AAD app. Required when the REST API dosn't is PowerBI REST API.
    AUTHORITY_URL :         URL used for initiating authorization request, can set to blank string, will use default URL. Required when the default URL fail.
    '''
    if ENVIRONMENT_TYPE.lower() not in ['global','21vianet']:
        return 'Error! Pls check the input of ENVIRONMENT_TYPE.'

    if SCOPE_BASE=='':
        SCOPE_BASE = ['https://analysis.windows.net/powerbi/api/.default'] if ENVIRONMENT_TYPE.lower()=="global" else ['https://analysis.chinacloudapi.cn/powerbi/api/.default']
    if AUTHORITY_URL=='':
        AUTHORITY_URL = 'https://login.microsoftonline.com/organizations' if ENVIRONMENT_TYPE.lower()=="global" else 'https://login.chinacloudapi.cn/organizations'

    try:
        response = None
        if AUTHENTICATION_MODE.lower() == 'masteruser':
            # Create a public client to authorize the app with the AAD app
            clientapp = msal.PublicClientApplication(CLIENT_ID, authority=AUTHORITY_URL)
            accounts = clientapp.get_accounts(username=USER_ACCOUNT)

            if accounts:
                # Retrieve Access token from user cache if available
                response = clientapp.acquire_token_silent(SCOPE_BASE, account=accounts[0])

            if not response:
                # Make a client call if Access token is not available in cache
                response = clientapp.acquire_token_by_username_password(USER_ACCOUNT, USER_PASSWORD, scopes=SCOPE_BASE) 

        elif AUTHENTICATION_MODE.lower() == 'serviceprincipal':
            authority = AUTHORITY_URL.replace('organizations', TENANT_ID)
            clientapp = msal.ConfidentialClientApplication(CLIENT_ID, client_credential=CLIENT_SECRET, authority=authority)

            # Make a client call if Access token is not available in cache
            response = clientapp.acquire_token_for_client(scopes=SCOPE_BASE)

        else:
            return 'Error! Pls check the input of AUTHENTICATION_MODE.'

        return 'Bearer ' + response['access_token']

    except Exception as ex:
        return 'Error retrieving Access token : ' + str(ex)

if __name__ == '__main__':
    # ----------------------------------------------------------config setting----------------------------------------------------------------

    # Can be set to 'Global' or '21Vianet'
    ENVIRONMENT_TYPE = 'Global'      

    # Can be set to 'MasterUser' or 'ServicePrincipal'
    AUTHENTICATION_MODE = 'MasterUser'

    # Client Id (Application Id) of the AAD app
    CLIENT_ID = 'Input CLIENT_ID here.'

    # Below params if don't need to use, pls use blank string to replace, such as: '' 

    # Id of the Azure tenant in which AAD app and Power BI report is hosted. Required only for ServicePrincipal authentication mode.
    TENANT_ID = ''

    # Client Secret (App Secret) of the AAD app. Required only for ServicePrincipal authentication mode.
    CLIENT_SECRET = ''

    # Master user email address. Required only for MasterUser authentication mode.
    USER_ACCOUNT = 'YourMicrosoftAccount'

    # Master user email password. Required only for MasterUser authentication mode.
    USER_PASSWORD = 'YourPassWord'

    # If Use PowerBI REST API, the below parma can set to blank string. Otherwise, pls input the available url of other REST API. And the input scopes should be a list, tuple, or set.
    SCOPE_BASE = ['https://TenantName.sharepoint.com/.default']

    # The below parma can set to blank string to use default AUTHORITY_URL. If the default setting fail, pls input the available url.
    AUTHORITY_URL = ''

    # ---------------------------------------------------------config setting end---------------------------------------------------------------

    token = get_access_token(ENVIRONMENT_TYPE,AUTHENTICATION_MODE,TENANT_ID,CLIENT_ID,CLIENT_SECRET,USER_ACCOUNT,USER_PASSWORD,SCOPE_BASE,AUTHORITY_URL)

    if 'Bearer' in token:
        # Get Excel File Content
        site_url = "https://TenantName.sharepoint.com/sites/SiteName"

        response = requests.get(
            "{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url),
            headers={'Authorization':token}
        )

        if response.status_code in [200,202]:
            with io.BytesIO(response.content) as fh:
                df = pd.read_excel(fh, sheet_name='SheetName')
                print(df)
        else:
            print(response.text)
    else:
        print(token)

    os.system('pause')

Python Office365-REST-Python-Client :

# pip install Office365-REST-Python-Client
# pip install pandas
# https://github.com/vgrem/Office365-REST-Python-Client
# Author: 夕枫

from office365.runtime.auth.user_credential import UserCredential
from office365.runtime.http.request_options import RequestOptions
from office365.sharepoint.client_context import ClientContext
import os,io
import pandas as pd

site_url = "https://TenantName.sharepoint.com/sites/SiteName"

# MasterUser Mode, accessing resources with account password
ctx = ClientContext(site_url).with_credentials(UserCredential("YourMicrosoftAccount", "YourPassWord"))

# Retrieve all files under a folder, excluding files under subfolders
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
print(response.text)

# Retrieve a specific file under a folder and return information such as its properties
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
print(response.text)

# Retrieve a specific file under a folder and return binary content
request = RequestOptions("{0}/_api/web/GetFolderByServerRelativeUrl('Shared Documents/FolderName1/FolderName2/../FolderName')/Files('ExcelFileName.xlsx')/$value".format(site_url))
response = ctx.pending_request().execute_request_direct(request)
with io.BytesIO(response.content) as fh:
    df = pd.read_excel(fh, sheet_name='SheetName')
    print(df)

# If you know the URL address of a specific file, you can directly access that specific file and return binary content
request = RequestOptions("https://TenantName.sharepoint.com/sites/SiteName/Shared Documents/FolderName1/FolderName2/../FolderName/ExcelFileName.xlsx")
response = ctx.pending_request().execute_request_direct(request)
with io.BytesIO(response.content) as fh:
    df = pd.read_excel(fh, sheet_name='SheetName')
    print(df)

os.system('pause')
未经允许不得转载:夕枫 » 通过REST API读取SharePoint Online上的文件 - 主用户模式
订阅评论
提醒
guest
0 评论
内联反馈
查看所有评论