How to extract a substring from a FlowFile name in NiFi

I have a file called 'test.abcde.houses.csv' and I want to extract the substring 'abcde', which I will use in my next processor to query the database.

Currently, I am using the UpdateAttribute processor to try to extract the substring.

UpdateAttribute

This is the code I am using in the value section.

var userPattern = java.util.regex.Pattern.compile('(.+?)\.[0-9]{8}-[0-9]{7,9}\..+');
var userMatcher = userPattern.matcher(fileName);
var matchExists = userMatcher.matches();

var user;
var userRemove;

if (matchExists) {
    user = userMatcher.group(1);
    userRemove = user + ".";
}
else {
    throw 'Unable to parse username from file metadata.';
}

Questions:

  1. Is this the right way to extract a substring from a FlowFile name in NiFi?
  2. Am I using the right processor?
  3. Does this code work with NiFi?

Solution 1:

You need to use the NiFi Expression Language, NiFi's own scripting feature, which lets you reference attributes, compare them to other values, and manipulate their values. Please refer to the official Expression Language documentation.

The UpdateAttribute processor is used to update existing attributes, derive new ones, or delete them. So you need to use the Expression Language inside UpdateAttribute to manipulate attributes, rather than JavaScript.

Example:

Your filename is test.abcde.houses.csv. To extract the string abcde from it, you can use the getDelimitedField function (an Expression Language string function) as shown below. If the expression does not evaluate to a value, the user attribute will be empty or null.

Property: user (if the attribute already exists its value is updated, otherwise a new attribute is created)

Value: ${filename:getDelimitedField(2, '.')} (abcde is the second field of the filename attribute value when it is split on '.'; field numbering starts at 1)
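
To make the field numbering concrete, here is a rough plain-JavaScript approximation of what that expression computes. It is only a sketch for reasoning about the result: delimitedField is a hypothetical helper, not part of NiFi, it ignores the quote and escape handling of the real function, and returning an empty string for a missing field is an assumption.

```javascript
// Hypothetical helper approximating ${filename:getDelimitedField(2, '.')}.
// Not a NiFi API; quote and escape characters are not handled here.
function delimitedField(value, index, delimiter) {
  // Split on the literal delimiter; fields are numbered from 1, not 0.
  const fields = value.split(delimiter);
  const field = fields[index - 1];
  // Assumption: a missing field yields an empty string.
  return field === undefined ? "" : field;
}

const filename = "test.abcde.houses.csv";
const user = delimitedField(filename, 2, ".");
console.log(user); // prints "abcde"
```

With the same helper, index 1 gives "test" and index 3 gives "houses", so changing the first argument in the expression is enough to pick a different part of the name.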

The Expression Language provides Boolean, conditional, string manipulation, and other functions, so you can replicate your JS logic in UpdateAttribute to derive the desired attribute value.
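
For comparison, the logic of the script in the question can be sketched in plain JavaScript against the sample file name. This is only an illustration to show what the Expression Language version needs to reproduce; extractUser is a hypothetical name, and the first pattern is an assumption about the intended file name layout (name, dot, user, dot, rest).

```javascript
// Sketch only: mirrors the structure of the script in the question, with a
// pattern written for names like 'test.abcde.houses.csv'.
// extractUser is a hypothetical name, not a NiFi API.
function extractUser(fileName) {
  // First field, a dot, then capture the second field (non-dot characters).
  const match = /^[^.]+\.([^.]+)\..+$/.exec(fileName);
  if (match === null) {
    throw new Error("Unable to parse username from file metadata.");
  }
  return match[1];
}

// The pattern from the question requires an 8-digit block, a hyphen and a
// 7-9 digit block after the first dot, so it cannot match this file name.
const questionPattern = /^(.+?)\.[0-9]{8}-[0-9]{7,9}\..+$/;

console.log(extractUser("test.abcde.houses.csv")); // prints "abcde"
console.log(questionPattern.test("test.abcde.houses.csv")); // prints false
```

The second result suggests that, even outside NiFi, the original regular expression would end up in the error branch for this particular file name, because the name contains no digit blocks.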