Feature: Custom replacements
Reported by Loz | March 2nd, 2010 @ 10:14 PM
This feature would be a handy extension/replacement for the blacklist.
Currently it is only possible to replace blacklisted characters with one chosen character.
It would be useful to be able to do different replacements for different characters, allowing you to remove unwanted characters, abbreviate strings and generally shorten file names.
EG:
replace "_-_" with "-"
remove "'" altogether
this could be configured something like the following:
<replace>
<searchstring>_-_</searchstring>
<replacement>-</replacement>
</replace>
<replace>
<searchstring>'</searchstring>
<replacement/>
</replace>
Use Case:
24 episodes will normally get named as follows:
24.S07E10.Day_7:_5:00_PM_-_6:00_PM.avi
which is a bit long-winded. This functionality would allow you to shorten them as follows:
24.S07E10.Day_7_5PM-6PM.avi
or even:
24.S07E10.Day_7_1700-1800.avi
Comments and changes to this ticket
-

dbr/Ben January 17th, 2010 @ 11:44 AM
- → State changed from new to open
(I merged the fixed markdown into your original ticket)
Agreed.
I don't particularly like the current blacklist system; replace_blacklisted_characters_with was intended more for invalid filesystem characters, rather than for things like replacing spaces.
What you propose would be nicer/more flexible, and could also tie in with #18 by providing an input_filename_replacements along with output_filename_replacements

Loz January 18th, 2010 @ 03:35 AM
Glad you like this feature request,
I would have thought that #18 could be done in config if this feature is available.
This would also be useful in combination with #2 as you could do things like replace Season-0 with Specials.
One thing to note here is that order is important: replacements should be performed in the order they occur in the config, otherwise it becomes difficult to predict what the result will be.
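To illustrate why order matters, here is a minimal sketch using the "24" filename from the use case above. The helper name apply_in_order and the specific rules are hypothetical, chosen only for illustration; this is plain Python string replacement, not tvnamer's own code.

```python
def apply_in_order(name, rules):
    """Apply (search, replacement) pairs one after another, in list order."""
    for search, replacement in rules:
        name = name.replace(search, replacement)
    return name


original = "24.S07E10.Day_7:_5:00_PM_-_6:00_PM.avi"

# Shorten the times first, then tidy the separator, then drop stray colons.
shorten_first = [(":00_PM", "PM"), ("_-_", "-"), (":", "")]

# Same rules, but colons are removed first, so ":00_PM" can no longer match.
colon_first = [(":", ""), (":00_PM", "PM"), ("_-_", "-")]

result_a = apply_in_order(original, shorten_first)
result_b = apply_in_order(original, colon_first)

print(result_a)  # 24.S07E10.Day_7_5PM-6PM.avi
print(result_b)  # 24.S07E10.Day_7_500_PM-600_PM.avi
```

The same three rules give two different filenames, which is why a fixed, documented order (the order in the config) keeps the result predictable.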
-

dbr/Ben January 26th, 2010 @ 01:33 AM
I've added the ability to (de)serialise dictionaries to the config class. This feature should now be trivial enough to implement, although I'm waiting until thetvdb.com is back up before continuing to do much more.
The config will likely look like:
    <option name="input_replacements" type="list">
      <value type="dict">
        <item key="is_regex"> <value type="bool">False</value> </item>
        <item key="with"> <value type="string">b</value> </item>
        <item key="replace"> <value type="string">a</value> </item>
      </value>
      <value type="dict">
        <item key="is_regex"> <value type="bool">True</value> </item>
        <item key="with"> <value type="string">y</value> </item>
        <item key="replace"> <value type="string">x</value> </item>
      </value>
    </option>
A bit convoluted, but with the current XML generation I can't really make it much tidier, I don't think.. Basically the above maps to..
    'input_replacements': [
        {'replace': 'a', 'with': 'b', 'is_regex': False},
        {'replace': 'x', 'with': 'y', 'is_regex': True},
    ]
I'm almost tempted to replace the dictionary config with JSON, since the above Python would be encoded as:
    '{"input_replacements": [{"is_regex": false, "with": "b", "replace": "a"}, {"is_regex": true, "with": "y", "replace": "x"}]}'
..which is a bit more sane
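The mapping between the Python structure and its JSON encoding can be checked with the standard json module. This is only a sketch of the round trip, not the config class itself; sort_keys is used here purely to make the key order predictable.

```python
import json

# The Python structure from the comment above.
config = {
    "input_replacements": [
        {"replace": "a", "with": "b", "is_regex": False},
        {"replace": "x", "with": "y", "is_regex": True},
    ]
}

# Encode to a JSON string; Python's False/True become JSON false/true.
encoded = json.dumps(config, sort_keys=True)
print(encoded)

# Decoding gives back an equal Python structure.
decoded = json.loads(encoded)
print(decoded == config)  # True
```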
-

dbr/Ben January 28th, 2010 @ 08:19 AM
..I'm very tempted to switch the config to JSON - already implemented it in a branch, http://github.com/dbr/tvnamer/tree/newargparse
The command line argument parsing is considerably improved, resulting in much simpler/shorter code and more concise configs. Existing XML configs will no longer work (it should reliably error when you try to use one), but it shouldn't be much hassle to copy over existing values
-

dbr/Ben January 31st, 2010 @ 09:41 AM
- → State changed from open to resolved
(from [c97f37098262b6c38cbf451edc56ba4e10634504]) Add custom replacements config feature [#24 state:resolved] [#18] http://github.com/dbr/tvnamer/commit/c97f37098262b6c38cbf451edc56ba...
-

dbr/Ben January 31st, 2010 @ 09:56 AM
This is now implemented. To take the example from ticket #18
Config file:
    {"input_filename_replacements": [
        {"is_regex": true, "match": "( and | & )", "replacement": " "}
     ],
     "output_filename_replacements": [
        {"is_regex": false, "match": " & ", "replacement": " and "}
     ]}
Example output:
    $ tvnamer --batch --config example.json Law\ \&\ Order\ s01e01.avi
    Loading config: example.json
    ####################
    # Starting tvnamer
    # Found 1 episodes
    ####################
    # Processing file: Law & Order s01e01.avi
    # With custom replacements: Law Order s01e01.avi
    # Detected series: Law Order (season: 1, episode: 1)
    ####################
    Old filename: Law & Order s01e01.avi
    Before custom output replacements: Law & Order - [01x01] - Prescription For Death.avi
    New filename: Law and Order - [01x01] - Prescription For Death.avi
Not entirely sure about the wording for "Before custom output replacements" (it's a bit long), but this can be changed later without breaking anything - the functionality should be there.
Until I update the readme, further examples can be found in the relevant test: http://github.com/dbr/tvnamer/blob/master/tests/test_custom_replace...
Basically, you have replacements for input and output files, and they can be simple string replacements (str.replace), or regular expressions (re.sub)
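As a rough sketch of how such a rule list could be applied, assuming each rule is a dict with the "is_regex", "match" and "replacement" keys shown in the config above. The function name apply_replacements is hypothetical and this is not tvnamer's actual implementation.

```python
import re


def apply_replacements(name, rules):
    """Apply each rule in list order: re.sub for regex rules, str.replace otherwise."""
    for rule in rules:
        if rule.get("is_regex", False):
            name = re.sub(rule["match"], rule["replacement"], name)
        else:
            name = name.replace(rule["match"], rule["replacement"])
    return name


config = {
    "input_filename_replacements": [
        {"is_regex": True, "match": "( and | & )", "replacement": " "}
    ],
    "output_filename_replacements": [
        {"is_regex": False, "match": " & ", "replacement": " and "}
    ],
}

cleaned_input = apply_replacements(
    "Law & Order s01e01", config["input_filename_replacements"]
)
final_output = apply_replacements(
    "Law & Order - [01x01] - Prescription For Death",
    config["output_filename_replacements"],
)

print(cleaned_input)  # Law Order s01e01
print(final_output)   # Law and Order - [01x01] - Prescription For Death
```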

Loz February 18th, 2010 @ 08:40 AM
Hi
I'm having a bit of trouble with this feature:
Maybe I'm doing it wrong but I'd have expected the following:
    {"is_regex": true, "match": "\.Episode_[0-9]+\.", "replacement": "."}]
to match .Episode_1. and replace it with a single dot. But it just outputs
    Error loading config: Invalid \escape: '.': line 35 column 50 (char 5594)
Any advice?
-

dbr/Ben February 18th, 2010 @ 11:28 AM
I think you need to double-up the backslashes before the .
    {"is_regex": true, "match": "\\.Episode_[0-9]+\\.", "replacement": "."}]
..otherwise \. is treated as an escape sequence like \n
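The difference can be reproduced with Python's json module directly. This is a small sketch outside tvnamer; the sample name Some_Show.S03E01.Episode_1.720p is made up for illustration.

```python
import json
import re

# Raw strings so the backslashes reach the JSON parser exactly as typed in a config file.
single = r'{"is_regex": true, "match": "\.Episode_[0-9]+\.", "replacement": "."}'
doubled = r'{"is_regex": true, "match": "\\.Episode_[0-9]+\\.", "replacement": "."}'

# A single backslash before "." is not a valid JSON escape, so parsing fails.
try:
    json.loads(single)
    single_ok = True
except ValueError as error:  # json.JSONDecodeError is a subclass of ValueError
    single_ok = False
    print("Error loading config:", error)

# Doubled backslashes decode to one backslash each, giving the regex \.Episode_[0-9]+\.
rule = json.loads(doubled)
print(rule["match"])

renamed = re.sub(rule["match"], rule["replacement"], "Some_Show.S03E01.Episode_1.720p")
print(renamed)  # Some_Show.S03E01.720p
```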

Loz February 21st, 2010 @ 10:26 AM
Hi
I got a chance to test this again today, unfortunately your suggestion does not seem to work.
It does not seem to match
My config contains the following:
    {"is_regex": true, "match": "\\.Episode_[0-9]\\.", "replacement": "."}
but that does not seem to match
    ####################
    # Starting tvnamer
    # Found 4 episodes
    ####################
    # Processing file: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    # Detected series: Secret Diary of a Call Girl (season: 3, episode: 1)
    ####################
    Old filename: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    Before custom output replacements: Secret Diary of a Call Girl.S03E01.Episode 1.mkv
    New filename: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    Rename?
Any other ideas to try?
-

dbr/Ben February 22nd, 2010 @ 11:03 AM
Oh, you are trying to match the . which is considered part of the extension, but the custom replacement function is only given Secret Diary of a Call Girl.S03E01.Episode 1 - it does not include .mkv (as I didn't think there would be any case where you want to modify the extension with a replacement), so..
    {
        "filename_with_episode": "%(seriesname)s.S%(seasonno)02dE%(episode)s.%(episodename)s%(ext)s",
        "output_filename_replacements": [
            {"is_regex": false, "match": " ", "replacement": "_"},
            {"is_regex": true, "match": "\\.Episode_[0-9]", "replacement": ""}
        ]
    }
Gives the following output:
    Old filename: Secret_Diary_of_a_Call_Girl.S03E01.mkv
    Before custom output replacements: Secret Diary of a Call Girl.S03E01.Episode 1.mkv
    New filename: Secret_Diary_of_a_Call_Girl.S03E01.mkv
To be clear:
- \\. in the JSON config is treated as a literal . in the regex. Like re.compile("\.")
- A lone . in the JSON config is treated as a regex "any character" symbol. Like re.compile(".")
- \. in the JSON config is a JSON syntax error.
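The extension behaviour described above can be sketched as follows. It assumes the extension is split off roughly the way os.path.splitext does it, which is an assumption about tvnamer's internals; the helper name replace_in_basename is hypothetical.

```python
import os
import re


def replace_in_basename(filename, pattern, replacement):
    """Run the regex on the name without its extension, then re-attach the extension."""
    base, ext = os.path.splitext(filename)
    return re.sub(pattern, replacement, base) + ext


name = "Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv"

# The trailing \. never matches: the "." before "mkv" is not part of the text searched.
with_trailing_dot = replace_in_basename(name, r"\.Episode_[0-9]\.", ".")

# Dropping the trailing \. lets the pattern match at the end of the base name.
without_trailing_dot = replace_in_basename(name, r"\.Episode_[0-9]", "")

print(with_trailing_dot)     # Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
print(without_trailing_dot)  # Secret_Diary_of_a_Call_Girl.S03E01.mkv
```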
-

Wob March 1st, 2010 @ 02:23 PM
Hi,
I am having some problem with case sensitivity. I'm sure it's something simple I am missing.
I can't get the system to do a case insensitive search, I have tried adding /i in various ways without success.
I am trying to process the file "CSI.S10E13.Internal.Combustion.HDTV.XviD-FQM.[VTV].avi"
I need to convert the CSI to CSI.Crime.Scene.Investigation otherwise tvnamer doesn't find the show.
To ensure that it doesn't do the same replacement for CSI.New.York I am searching for "CSI.S" and replacing with "CSI.Crime.Scene.Investigation.S" but I need the search to be case insensitive as sometimes the files are upper-case sometimes lower-case. I could create a second rule, but I assume being reg ex it should be doable.
This is my current working rule;
    {"is_regex": true, "match": "CSI.S", "replacement": "CSI.Crime.Scene.Investigation.S"}
but it doesn't work if the case is wrong.
Cheers,
Beau

dbr/Ben March 2nd, 2010 @ 10:14 PM
Hm, aside from doing [cC][sS][iI] etc, this isn't currently possible.
Regexes in Python aren't quite like Perl, where you can add the flags with /i; they are specified by an argument to the re.compile call.
I shall add another option to the dictionary to specify flags. I've made a separate ticket for this: #40
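For reference, this is roughly what the flag argument looks like in plain Python, next to the character-class workaround. It is a sketch of the re module only, not of the planned config option. The dot is escaped here so that it matches a literal "."; the lowercase filename is a made-up example. Python's regex syntax also accepts an inline (?i) prefix, which might already work inside a config pattern, though that is untested here.

```python
import re

replacement = "CSI.Crime.Scene.Investigation.S"
lower_name = "csi.s10e13.internal.combustion.avi"

# Flags are passed as an argument to re.compile rather than as a /i suffix.
flagged = re.compile(r"CSI\.S", re.IGNORECASE)
via_flag = flagged.sub(replacement, lower_name)

# The workaround mentioned above: spell out both cases for every letter.
via_classes = re.sub(r"[cC][sS][iI]\.[sS]", replacement, lower_name)

# Inline flag at the start of the pattern, standard Python regex syntax.
via_inline = re.sub(r"(?i)CSI\.S", replacement, lower_name)

print(via_flag)  # CSI.Crime.Scene.Investigation.S10e13.internal.combustion.avi

# A different show is left alone because "CSI.N" does not match "CSI\.S".
other_show = flagged.sub(replacement, "CSI.New.York.S01E01.avi")
print(other_show)  # CSI.New.York.S01E01.avi
```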