【Question title】: U-SQL: Read latest modified file from a folder
【Posted】: 2019-12-22 13:35:58
【Question】:

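How can we read the latest modified file from each of two different folders in U-SQL? Note: each folder contains many files, but we only want the latest one (a single file).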

First folder: E:\mysystem\dailyfiles\daily
Second folder: E:\mysystem\weeklyfiles\weekly

DECLARE @file1 string = "dailyfiles/daily/LATESTMODIFIEDFILENAME.csv";
DECLARE @file2 string = "weeklyfiles/weekly/LATESTMODIFIEDFILENAME.csv";

DECLARE @out string = "/output/result.csv";

@data =
    EXTRACT col1 string,
            col2 string,
            col3 string,
            col4 string
    FROM @file1, @file2
    USING Extractors.Csv();

【Comments】:

  • What is wrong with the answer provided? Please mark it as accepted.
  • I am unable to get the latest modified file. It throws an error at DateModified = FILE.MODIFIED().

Tags: azure-sql-database azure-data-lake u-sql azure-data-factory-2


【Answer 1】:

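So I guess you want to get the most recently modified file from two different folders that each contain many files (I assume the files all have the same format). You should use the file property functions, together with virtual columns as a dynamic path: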

@allData =
    EXTRACT col1 string,
            col2 string,
            col3 string,
            DateModified = FILE.MODIFIED(), // file property: last-modified time of the source file
            folder1 string, // virtual column
            folder2 string  // virtual column
    FROM "mysystem/{folder1}/{folder2}/{*}.csv"
    USING Extractors.Csv();


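// Keep only the rows that come from the most recently modified file
// in the two target folders.
@latest =
    SELECT a.col1,
           a.col2,
           a.col3
    FROM @allData AS a
         SEMIJOIN
         (
             // MAX without GROUP BY already returns a single row.
             SELECT MAX(DateModified) AS MaxFileDate
             FROM @allData
             WHERE (folder1 == "dailyfiles" AND folder2 == "daily") OR (folder1 == "weeklyfiles" AND folder2 == "weekly")
         ) AS b
         ON a.DateModified == b.MaxFileDate
    WHERE (a.folder1 == "dailyfiles" AND a.folder2 == "daily") OR (a.folder1 == "weeklyfiles" AND a.folder2 == "weekly");

// An OUTPUT statement needs a target path and an outputter.
OUTPUT @latest
TO "/output/result.csv"
USING Outputters.Csv();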
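
If what you actually need is the latest file from each folder (the question declares both @file1 and @file2), one possible variant is to compute the newest modification time per folder pair and join back on it. This is an untested sketch: it assumes the same folder layout and column schema as above and a U-SQL runtime that supports the FILE.MODIFIED() file property column. The rowset names @wanted, @latestPerFolder and @result are made up for illustration.

```usql
@allData =
    EXTRACT col1 string,
            col2 string,
            col3 string,
            DateModified = FILE.MODIFIED(), // file property: last-modified time
            folder1 string, // virtual column
            folder2 string  // virtual column
    FROM "mysystem/{folder1}/{folder2}/{*}.csv"
    USING Extractors.Csv();

// Restrict the file set to the two folders of interest.
@wanted =
    SELECT col1,
           col2,
           col3,
           DateModified,
           folder1,
           folder2
    FROM @allData
    WHERE (folder1 == "dailyfiles" AND folder2 == "daily")
          OR (folder1 == "weeklyfiles" AND folder2 == "weekly");

// Newest modification time for each folder pair.
@latestPerFolder =
    SELECT folder1,
           folder2,
           MAX(DateModified) AS MaxFileDate
    FROM @wanted
    GROUP BY folder1, folder2;

// Keep only the rows that come from the newest file in each folder.
@result =
    SELECT w.col1,
           w.col2,
           w.col3,
           w.folder1
    FROM @wanted AS w
         INNER JOIN @latestPerFolder AS l
         ON w.folder1 == l.folder1
            AND w.folder2 == l.folder2
            AND w.DateModified == l.MaxFileDate;

OUTPUT @result
TO "/output/result.csv"
USING Outputters.Csv();
```

Note that if two files in the same folder share exactly the same modification time, rows from both are kept. The folder1 column is carried into the output only so you can tell the daily and weekly rows apart.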

【Comments】:
