FilesystemFdw: Enhance pattern matching and timestamp support #205
Closed
rvernica wants to merge 9 commits into Segfault-Inc:master from rvernica:master
* Allow user to use regular expression syntax in the pattern
* Track the actual filename matched in the Item since it can't be
recomputed from the pattern
* Example pattern supported now:
pattern '{taz}/{foo}[ _-]{bar}\.(jpe?g|png)',
allows for tokens to be separated by " ", "-", or "_" and allows the
extension to be "jpg", "jpeg", or "png"
* ignore_case option set to FALSE by default
* Used in re.compile to allow for case-insensitive regular expression matches
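The point about tracking the actual filename can be illustrated with a small sketch. It is not the code from this PR: the regular expression below is a hand-expanded form of the example pattern, the choice of `[^/]+` for each token is an assumption, and `match_file` is a hypothetical helper. It shows that several different files can yield the same token values, so the filename cannot be rebuilt from the tokens alone.

```python
import re

# Hand-expanded form of '{taz}/{foo}[ _-]{bar}\.(jpe?g|png)'. Mapping each
# {token} to a named group that stops at '/' is an assumption of this sketch.
matcher = re.compile(
    r"^(?P<taz>[^/]+)/(?P<foo>[^/]+)[ _-](?P<bar>[^/]+)\.(jpe?g|png)$",
    re.IGNORECASE,  # corresponds to ignore_case 'TRUE'
)


def match_file(relative_path):
    """Return token values plus the path that matched, or None (hypothetical helper)."""
    m = matcher.match(relative_path)
    if m is None:
        return None
    # Keep the matched path: it cannot be recomputed from the token values.
    return {"actual_filename": relative_path, **m.groupdict()}


rows = [match_file(p) for p in ("a/x_y.jpg", "a/x-y.PNG", "a/x y.jpeg")]
tokens = {(r["taz"], r["foo"], r["bar"]) for r in rows}
print(len(rows), "files,", len(tokens), "distinct token tuple(s)")
```

All three paths produce the same `taz`, `foo`, and `bar` values, which is why the matched filename has to be stored alongside them.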
* Extract mtime and ctime from file using os.stat
* Propagate mtime and ctime to Item and convert to datetime
* Add mtime and ctime column options
* Propagate and update mtime and ctime throughout FilesystemFdw
* Example:
  CREATE FOREIGN TABLE foo (
      filename VARCHAR,
      mtime TIMESTAMP,
      ctime TIMESTAMP,
      foo VARCHAR
  ) SERVER filesystem_srv OPTIONS (
      root_dir '/f',
      pattern '{foo}.zip',
      filename_column 'filename',
      mtime_column 'mtime',
      ctime_column 'ctime');
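As a rough illustration of the os.stat step described above, the following sketch reads both timestamps and converts them to datetime objects. The function name `file_timestamps` is hypothetical, and converting to naive local time is an assumption; the commit messages do not say which time zone handling the PR uses.

```python
import os
from datetime import datetime


def file_timestamps(path):
    """Return (mtime, ctime) of `path` as naive local datetimes (hypothetical helper)."""
    st = os.stat(path)
    mtime = datetime.fromtimestamp(st.st_mtime)  # last content modification
    ctime = datetime.fromtimestamp(st.st_ctime)  # metadata change on Unix, creation on Windows
    return mtime, ctime


mtime, ctime = file_timestamps(os.curdir)
print("mtime:", mtime.isoformat(), "ctime:", ctime.isoformat())
```

Note that `st_ctime` is platform dependent, so a `ctime` column may mean different things on Unix and Windows hosts.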
* Set actual_filename on from_filename Items
* Add set_timestamp function to Item to set mtime and ctime
* Used by constructor
* Used by execute
* execute function:
* use isfile to check if file exists (consistent with
StructuredDirectory)
* Get file timestamps and set them in the item
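One practical difference between an isfile check and a plain existence check is how directories are treated. The sketch below is only an illustration with made-up names (it does not use FilesystemFdw or StructuredDirectory): a directory whose name happens to match `{foo}.zip` passes `os.path.exists` but is rejected by `os.path.isfile`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "a.zip"))            # a directory with a matching name
    open(os.path.join(root, "b.zip"), "w").close()   # a regular file

    candidates = sorted(os.listdir(root))
    existing = [n for n in candidates if os.path.exists(os.path.join(root, n))]
    files = [n for n in candidates if os.path.isfile(os.path.join(root, n))]

print("exists:", existing)
print("isfile:", files)
```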
Highlights:
pattern option
This PR adds four more options to FilesystemFdw:
* escape_pattern (default: TRUE)
  If TRUE, the pattern used to match files is escaped before it is
  used for regular expression matching. If FALSE, the pattern used to
  match files is used as is and it is assumed to be a valid regular
  expression.
* ignore_case (default: FALSE)
  If FALSE, the pattern used to match files is case sensitive. If
  TRUE, the pattern used to match files is case insensitive.
* mtime_column
  If set, defines which column will contain the file mtime.
* ctime_column
  If set, defines which column will contain the file ctime.
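How escape_pattern and ignore_case might interact with the `{token}` placeholders can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation in this PR: `build_regex` is a hypothetical function, and matching each token with `[^/]+` is a guess at reasonable behavior.

```python
import re

TOKEN = re.compile(r"(\{\w+\})")  # capturing group so split() keeps the tokens


def build_regex(pattern, escape_pattern=True, ignore_case=False):
    """Compile a '{token}' pattern into a regex (hypothetical helper)."""
    pieces = []
    for part in TOKEN.split(pattern):
        if TOKEN.fullmatch(part):
            # '{foo}' becomes a named group; '[^/]+' is an assumption.
            pieces.append("(?P<%s>[^/]+)" % part[1:-1])
        elif escape_pattern:
            pieces.append(re.escape(part))  # literal text: '.' matches only '.'
        else:
            pieces.append(part)             # trusted to be valid regex syntax
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile("^" + "".join(pieces) + "$", flags)


literal = build_regex("{foo}.zip")                    # escape_pattern TRUE
raw = build_regex("{foo}.zip", escape_pattern=False)  # '.' is a wildcard

print(bool(literal.match("a.zip")), bool(literal.match("aXzip")))
print(bool(raw.match("a.zip")), bool(raw.match("aXzip")))
```

With escaping on, existing patterns such as `{foo}.zip` keep matching only a literal dot, which is presumably why TRUE is the default; regular expression syntax has to be opted into with escape_pattern 'FALSE'.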
With this PR, the following example is supported:
CREATE FOREIGN TABLE foo (
    filename VARCHAR,
    mtime TIMESTAMP,
    ctime TIMESTAMP,
    foo VARCHAR,
    bar VARCHAR,
    taz VARCHAR
) SERVER filesystem_srv OPTIONS (
    root_dir '/f',
    pattern '{taz}/{foo}[ _-]{bar}\.(jpe?g|png)',
    escape_pattern 'FALSE',
    ignore_case 'TRUE',
    filename_column 'filename',
    mtime_column 'mtime',
    ctime_column 'ctime');
SELECT * FROM foo;

Notice the regular expression used for the pattern option, which allows the foo and bar tokens to be separated by " ", _, or -. Also, the file extensions can be .jpg, .jpeg, or .png, case insensitive.

Fix for #203