【Posted】: 2019-09-24 09:47:24
【Question】:
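I need to export a fairly large CSV file from Oracle every week.
I have tried two approaches:
- Adapter.Fill(DataSet)
- Looping through the columns and rows and saving to the CSV file one row at a time.
The first runs out of memory at run time (the server only has 4 GB of RAM), and the second takes about an hour because there are more than 4 million rows to export.
Here is code #1: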
#Your query. It cannot contain any double quotes otherwise it will break.
$query = "SELECT manycolumns FROM somequery"
#Oracle login credentials and other variables
$username = "username"
$password = "password"
$datasource = "database address"
$output = "\\NetworkLocation\Sales.csv"
#Creates a blank CSV file and makes sure it is ASCII-encoded
Out-File -FilePath $output -Encoding ascii -Force
#Looks for the "Oracle.ManagedDataAccess.dll" file inside the "C:\Oracle" folder. We usually have two versions of Oracle installed, so the adapter can be in different locations; the first match is used. Needs changing if Oracle is installed elsewhere.
$location = Get-ChildItem -Path C:\Oracle -Filter Oracle.ManagedDataAccess.dll -Recurse -ErrorAction SilentlyContinue -Force | Select-Object -First 1
#Establishes connection to Oracle using the DLL file
Add-Type -Path $location.FullName
$connectionString = 'User Id=' + $username + ';Password=' + $password + ';Data Source=' + $datasource
$connection = New-Object Oracle.ManagedDataAccess.Client.OracleConnection($connectionString)
$connection.Open()
$command = $connection.CreateCommand()
$command.CommandText = $query
#Creates a table in memory and fills it with results from the query. Then, export the virtual table into CSV.
$DataSet = New-Object System.Data.DataSet
$Adapter = New-Object Oracle.ManagedDataAccess.Client.OracleDataAdapter($command)
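#Fill returns the number of rows added; discard it so it does not end up in the script output
[void]$Adapter.Fill($DataSet)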
$DataSet.Tables[0] | Export-Csv $output -NoTypeInformation
$connection.Close()
Here is code #2:
#Your query. It cannot contain any double quotes otherwise it will break.
$query = "SELECT manycolumns FROM somequery"
#Oracle login credentials and other variables
$username = "username"
$password = "password"
$datasource = "database address"
$output = "\\NetworkLocation\Sales.csv"
$tempfile = $env:TEMP + "\Temp.csv"
#Creates a blank CSV file and makes sure it is ASCII-encoded
Out-File -FilePath $tempfile -Encoding ascii -Force
#Looks for the "Oracle.ManagedDataAccess.dll" file inside the "C:\Oracle" folder; the first match is used. Needs changing if Oracle is installed elsewhere.
$location = Get-ChildItem -Path C:\Oracle -Filter Oracle.ManagedDataAccess.dll -Recurse -ErrorAction SilentlyContinue -Force | Select-Object -First 1
#Establishes connection to Oracle using the DLL file
Add-Type -Path $location.FullName
$connectionString = 'User Id=' + $username + ';Password=' + $password + ';Data Source=' + $datasource
$connection = New-Object Oracle.ManagedDataAccess.Client.OracleConnection($connectionString)
$connection.Open()
$command = $connection.CreateCommand()
$command.CommandText = $query
#Reads the results column by column. This way you don't have to specify how many columns there are.
$reader = $command.ExecuteReader()
while ($reader.Read()) {
    #An ordered dictionary keeps the columns in the same order as the query
    $props = [ordered]@{}
    for ($i = 0; $i -lt $reader.FieldCount; $i++) {
        $name = $reader.GetName($i)
        $value = $reader.GetValue($i)
        $props.Add($name, $value)
    }
    #Exports each row to the CSV file. Works best when the file is on a local drive, as the file is saved after each row.
    New-Object PSObject -Property $props | Export-Csv $tempfile -NoTypeInformation -Append
}
$reader.Close()
Move-Item $tempfile $output -Force
$connection.Close()
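Ideally I would like to use the first code, because it is much faster than the second, but without running out of memory.
Do you know whether there is some way to "fill" the first 1 million records, append them to the CSV, clear the DataSet table, then do the next 1 million records, and so on? Once the code has finished, the CSV weighs about 1.3 GB, but while it is running even 8 GB of RAM is not enough (my laptop has 8 GB, but the server only has 4 GB, so it really struggles).
Any tips would be greatly appreciated.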
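To make the idea concrete, here is a rough sketch of one possible shape, not a tested solution. Instead of filling a DataSet in batches, it swaps in a different technique: it streams rows from a data reader straight into a single open StreamWriter, so only one row is held in memory at a time and the file is not reopened for every row. The function names `ConvertTo-CsvField` and `Export-ReaderToCsv` are made up for illustration, every field is quoted for simplicity, and the small in-memory DataTable at the bottom is only a stand-in for the Oracle reader so the sketch can run without a database.

```powershell
# Quote one value for CSV: wrap it in double quotes and double any embedded quotes.
# (Hypothetical helper; NULL / DBNull values become an empty field.)
function ConvertTo-CsvField {
    param($Value)
    if ($null -eq $Value -or $Value -is [System.DBNull]) { return '' }
    $text = [string]$Value
    return '"' + $text.Replace('"', '""') + '"'
}

# Stream every row of an open data reader into a CSV file and return the row count.
# (Hypothetical helper; $Path should be a full path, since .NET does not follow
# the PowerShell current location.)
function Export-ReaderToCsv {
    param(
        [System.Data.IDataReader]$Reader,
        [string]$Path
    )
    $writer = New-Object System.IO.StreamWriter($Path, $false, [System.Text.Encoding]::ASCII)
    try {
        $count = $Reader.FieldCount
        $cells = New-Object 'string[]' $count

        # Header line built from the column names
        for ($i = 0; $i -lt $count; $i++) {
            $cells[$i] = ConvertTo-CsvField ($Reader.GetName($i))
        }
        $writer.WriteLine($cells -join ',')

        # One line per row; nothing is kept once the line has been written
        $rows = 0
        while ($Reader.Read()) {
            for ($i = 0; $i -lt $count; $i++) {
                $cells[$i] = ConvertTo-CsvField ($Reader.GetValue($i))
            }
            $writer.WriteLine($cells -join ',')
            $rows++
        }
        $rows
    }
    finally {
        $writer.Dispose()
    }
}

# Stand-in data so the sketch runs without Oracle
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add('Id', [int])
[void]$table.Columns.Add('Name', [string])
[void]$table.Rows.Add(1, 'plain')
[void]$table.Rows.Add(2, 'has "quotes", and a comma')

$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'reader-demo.csv'
$demoReader = $table.CreateDataReader()
$written = Export-ReaderToCsv -Reader $demoReader -Path $demoPath
$demoReader.Close()
```

With the Oracle connection from code #2, the call would presumably look like `Export-ReaderToCsv -Reader ($command.ExecuteReader()) -Path $tempfile`, followed by the same Move-Item to the network location. I have not measured this against 4 million rows, so whether it gets close to the speed of code #1 is an open question.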
【Comments】:
-
How about telling Oracle itself to create a CSV file? That will perform better, because the database engine does all the heavy lifting locally.
-
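Do you need "admin" rights on Oracle to do that? My team only has "read" access, because the database is owned and updated by a third-party company, and we pay a lot of money just to have simple changes made.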
-
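Questions about Oracle permissions would be better off on DBA.SE. Consider posting a brand-new question there about how to do the CSV export, and perhaps also about best practices for it.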
-
I don't have any questions about permissions. I just want to use the Windows scheduler to export the query results to a 1.3 GB CSV file once a week.
Tags: oracle powershell csv export-to-csv