#24 ✓resolved
Loz

Feature: Custom replacements

Reported by Loz | January 17th, 2010 @ 04:50 AM

This feature would be a handy extension/replacement for the blacklist.

Currently it is only possible to replace blacklisted characters with one chosen character.

It would be useful to be able to do different replacements for different characters, allowing you to remove unwanted characters, abbreviate strings and generally shorten file names.

EG:

replace "_-_" with "-"
remove "'" altogether

this could be configured something like the following:

<replace>
<searchstring>_-_</searchstring>
<replacement>-</replacement>
</replace>
<replace>
<searchstring>'</searchstring>
<replacement/>
</replace>

Use Case:
24 episodes will normally get named as follows:

24.S07E10.Day_7:_5:00_PM_-_6:00_PM.avi

which is a bit long winded. This functionality would allow you to shorten them as follows:

24.S07E10.Day_7_5PM-6PM.avi

or even:

24.S07E10.Day_7_1700-1800.avi

Comments and changes to this ticket

  • dbr/Ben

    dbr/Ben January 17th, 2010 @ 11:44 AM

    • State changed from “new” to “open”

    (I merged the fixed markdown into your original ticket)

    Agreed.

    I don't particularly like the current blacklist system. The replace_blacklisted_characters_with option was intended more for invalid filesystem characters than for things like replacing spaces with .

    What you propose would be nicer/more flexible, and could also tie in with #18 by providing an input_filename_replacements along with output_filename_replacements

  • Loz

    Loz January 18th, 2010 @ 03:35 AM

    Glad you like this feature request,

    I would have thought that #18 could be done in config if this feature is available.
    This would also be useful in combination with #2 as you could do things like replace Season-0 with Specials.

    One thing to note here is that order is important: replacements should be performed in the order they occur in the config, otherwise it becomes difficult to predict what the result will be.
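
To illustrate the point about ordering, here is a minimal sketch using plain string replacements. The helper name `apply_in_order` and the sample title are made up for this example and are not part of tvnamer.

```python
def apply_in_order(name, replacements):
    """Apply (match, replacement) pairs one after another, in list order."""
    for match, replacement in replacements:
        name = name.replace(match, replacement)
    return name


title = "5PM - 6PM"

# Same two rules, different order.
spaces_first = [(" ", "_"), ("_-_", "-")]
dash_first = [("_-_", "-"), (" ", "_")]

print(apply_in_order(title, spaces_first))  # 5PM-6PM
print(apply_in_order(title, dash_first))    # 5PM_-_6PM
```

With the second ordering, the "_-_" rule runs before any underscores exist, so it never matches.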

  • dbr/Ben

    dbr/Ben January 26th, 2010 @ 01:33 AM

    I've added the ability to (de)serialise dictionaries to the config class. This feature should now be trivial enough to implement, although I'm waiting until thetvdb.com is back up before continuing to do much more.

    The config will likely look like:

    <option name="input_replacements" type="list">
        <value type="dict">
            <item key="is_regex">
                <value type="bool">False</value>
            </item>
            <item key="with">
                <value type="string">b</value>
            </item>
            <item key="replace">
                <value type="string">a</value>
            </item>
        </value>
        <value type="dict">
            <item key="is_regex">
                <value type="bool">True</value>
            </item>
            <item key="with">
                <value type="string">y</value>
            </item>
            <item key="replace">
                <value type="string">x</value>
            </item>
        </value>
    </option>
    

    A bit convoluted, but with the current XML generation I can't really make it much tidier, I don't think.. Basically the above maps to..

    'input_replacements': [
        {'replace': 'a', 'with': 'b', 'is_regex': False},
        {'replace': 'x', 'with': 'y', 'is_regex': True},
    ]
    

    I'm almost tempted to replace the dictionary config with JSON, since the above Python would be encoded as:

    '{"input_replacements": [{"is_regex": false, "with": "b", "replace": "a"}, {"is_regex": true, "with": "y", "replace": "x"}]}'
    

    ..which is a bit more sane
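
As a quick check that the JSON string and the Python structure above describe the same data, the following sketch decodes one with the standard `json` module and compares it to the other. The variable names are only for illustration.

```python
import json

# The Python structure from the comment above.
python_value = {
    "input_replacements": [
        {"replace": "a", "with": "b", "is_regex": False},
        {"replace": "x", "with": "y", "is_regex": True},
    ]
}

# The JSON encoding from the comment above.
json_text = (
    '{"input_replacements": ['
    '{"is_regex": false, "with": "b", "replace": "a"}, '
    '{"is_regex": true, "with": "y", "replace": "x"}]}'
)

decoded = json.loads(json_text)
print(decoded == python_value)  # True
```

Key order inside each object does not matter for the comparison, and JSON `true`/`false` decode to Python `True`/`False`.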

  • dbr/Ben

    dbr/Ben January 28th, 2010 @ 08:19 AM

    ..I'm very tempted to switch the config to JSON - already implemented it in a branch, http://github.com/dbr/tvnamer/tree/newargparse

    The command line argument parsing is considerably improved, resulting in much simpler/shorter code and more concise configs. Existing XML configs will no longer work (it should reliably error when you try to use one), but it shouldn't be much hassle to copy over existing values.

  • dbr/Ben

    dbr/Ben January 31st, 2010 @ 09:41 AM

    • State changed from “open” to “resolved”
  • dbr/Ben

    dbr/Ben January 31st, 2010 @ 09:56 AM

    This is now implemented. To take the example from ticket #18

    Config file:

    {"input_filename_replacements": [
        {"is_regex": true,
        "match": "( and | & )",
        "replacement": " "}
    ],
    "output_filename_replacements": [
        {"is_regex": false,
        "match": " & ",
        "replacement": " and "}
    ]}
    

    Example output:

    $ tvnamer --batch --config example.json Law\ \&\ Order\ s01e01.avi
    Loading config: example.json
    ####################
    # Starting tvnamer
    # Found 1 episodes
    ####################
    # Processing file: Law & Order s01e01.avi
    # With custom replacements: Law Order s01e01.avi
    # Detected series: Law Order (season: 1, episode: 1)
    ####################
    Old filename: Law & Order s01e01.avi
    Before custom output replacements: Law & Order - [01x01] - Prescription For Death.avi
    New filename: Law and Order - [01x01] - Prescription For Death.avi
    

    Not entirely sure about the wording for "Before custom output replacements" (it's a bit long), but this can be changed later without breaking anything - the functionality should be there.

    Until I update the readme, further examples can be found in the relevant test: http://github.com/dbr/tvnamer/blob/master/tests/test_custom_replace...

    Basically, you have replacements for input and output files, and they can be simple string replacements (str.replace), or regular expressions (re.sub)
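
The behaviour described above can be sketched roughly as follows. This is not tvnamer's actual code: the function name `apply_replacements` is hypothetical, and only the keys shown in the config (`is_regex`, `match`, `replacement`) are taken from the ticket.

```python
import re


def apply_replacements(name, replacements):
    """Apply each rule in list order: re.sub for regex rules, str.replace otherwise."""
    for rule in replacements:
        if rule["is_regex"]:
            name = re.sub(rule["match"], rule["replacement"], name)
        else:
            name = name.replace(rule["match"], rule["replacement"])
    return name


# Rules copied from the example config above.
input_rules = [{"is_regex": True, "match": "( and | & )", "replacement": " "}]
output_rules = [{"is_regex": False, "match": " & ", "replacement": " and "}]

print(apply_replacements("Law & Order s01e01", input_rules))
# Law Order s01e01
print(apply_replacements("Law & Order - [01x01] - Prescription For Death", output_rules))
# Law and Order - [01x01] - Prescription For Death
```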

  • Loz

    Loz February 18th, 2010 @ 08:40 AM

    Hi

    I'm having a bit of trouble with this feature:

    Maybe I'm doing it wrong but I'd have expected the following:

    {"is_regex": true,
    "match": "\.Episode_[0-9]+\.",
    "replacement": "."}]
    

    To match .Episode_1. and replace it with a single dot. But it just outputs

    Error loading config: Invalid \escape: '.': line 35 column 50 (char 5594)
    

    Any advice?

  • dbr/Ben

    dbr/Ben February 18th, 2010 @ 11:28 AM

    I think you need to double-up the backslashes before the .

    {"is_regex": true,
    "match": "\\.Episode_[0-9]+\\.",
    "replacement": "."}]
    

    ..otherwise \. is treated as an escape sequence like \n

  • Loz

    Loz February 21st, 2010 @ 10:26 AM

    Hi

    I got a chance to test this again today. Unfortunately your suggestion does not seem to work.

    It does not seem to match

    My config contains the following:

    {"is_regex": true, "match": "\\.Episode_[0-9]\\.", "replacement": "."}
    

    but that does not seem to match

    ####################
    # Starting tvnamer
    # Found 4 episodes
    ####################
    # Processing file: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    # Detected series: Secret Diary of a Call Girl (season: 3, episode: 1)
    ####################
    Old filename: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    Before custom output replacements: Secret Diary of a Call Girl.S03E01.Episode 1.mkv
    New filename: Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv
    Rename?
    

    Any other ideas to try?

  • dbr/Ben

    dbr/Ben February 22nd, 2010 @ 11:03 AM

    Oh, you are trying to match the . which is considered part of the extension, but the custom replacement function is only given Secret Diary of a Call Girl.S03E01.Episode 1 - it does not include .mkv (as I didn't think there would be any case where you want to modify the extension with a replacement), so..

    {
        "filename_with_episode": "%(seriesname)s.S%(seasonno)02dE%(episode)s.%(episodename)s%(ext)s", 
        "output_filename_replacements": [
            {"is_regex": false, "match": " ", "replacement": "_"},
            {"is_regex": true, "match": "\\.Episode_[0-9]", "replacement": ""}
        ]
    }
    

    Gives the following output:

    Old filename: Secret_Diary_of_a_Call_Girl.S03E01.mkv
    Before custom output replacements: Secret Diary of a Call Girl.S03E01.Episode 1.mkv
    New filename: Secret_Diary_of_a_Call_Girl.S03E01.mkv
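
A small sketch of why the trailing `\.` never matched. It assumes the extension is split off in the manner of `os.path.splitext` before the replacements run (an assumption for illustration, not confirmed tvnamer internals), and that spaces have already been turned into underscores by an earlier rule.

```python
import os
import re

# What the JSON string "\\.Episode_[0-9]\\." decodes to.
pattern = r"\.Episode_[0-9]\."

full_name = "Secret_Diary_of_a_Call_Girl.S03E01.Episode_1.mkv"
base, ext = os.path.splitext(full_name)

print(base)  # Secret_Diary_of_a_Call_Girl.S03E01.Episode_1
print(ext)   # .mkv

# The trailing dot only exists when the extension is still attached.
print(re.search(pattern, full_name) is not None)  # True
print(re.search(pattern, base) is not None)       # False
```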
    

    To be clear:

    • \\. in the JSON config is treated as a literal . in the regex. Like re.compile("\.")
    • A lone . in the JSON config is treated as a regex "any character" symbol. Like re.compile(".")
    • \. in the JSON config is a JSON syntax error.
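
The three cases above can be checked with the standard `json` and `re` modules. In this sketch the raw strings stand in for the exact text of the config file, and the variable names are illustrative only.

```python
import json
import re

# Doubled backslashes in the config text decode to a single backslash.
doubled = json.loads(r'{"match": "\\.Episode_[0-9]+\\."}')["match"]
print(doubled)  # \.Episode_[0-9]+\.
print(re.sub(doubled, ".", "S03E01.Episode_1.mkv"))  # S03E01.mkv

# A lone . is the regex "any character" symbol.
print(re.fullmatch(".", "x") is not None)  # True

# A single backslash before . is not a valid JSON escape.
rejected = False
try:
    json.loads(r'{"match": "\.Episode_[0-9]+\."}')
except ValueError as error:
    rejected = True
    print("rejected:", error)
```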
  • Wob

    Wob March 1st, 2010 @ 02:23 PM

    Hi,

    I am having some problem with case sensitivity. I'm sure it's something simple I am missing.

    I can't get the system to do a case insensitive search, I have tried adding /i in various ways without success.

    I am trying to process the file "CSI.S10E13.Internal.Combustion.HDTV.XviD-FQM.[VTV].avi"

    I need to convert the CSI to CSI.Crime.Scene.Investigation otherwise tvnamer doesn't find the show.

    To ensure that it doesn't do the same replacement for CSI.New.York, I am searching for "CSI.S" and replacing it with "CSI.Crime.Scene.Investigation.S". However, I need the search to be case insensitive, as the files are sometimes upper-case and sometimes lower-case. I could create a second rule, but I assume that, being regex, it should be doable.

    This is my current working rule;

         {"is_regex": true,
         "match": "CSI.S",
         "replacement": "CSI.Crime.Scene.Investigation.S"}
    

    but it doesn't work if the case is wrong.

    Cheers,
    Beau

  • dbr/Ben

    dbr/Ben March 2nd, 2010 @ 10:14 PM

    Hm, aside from doing [cC][sS][iI] etc, this isn't currently possible

    Regexes in Python aren't quite like Perl, where you can add the flags with /i; they are specified by an argument to the re.compile call.

    I shall add another option to the dictionary to specify flags, I've made a separate ticket for this: #40
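
For reference, this is how case-insensitive matching looks in plain Python `re`, outside of tvnamer. The flag can be passed as an argument, or embedded at the start of the pattern as `(?i)`. Whether an inline `(?i)` in the config's "match" value would reach `re` unchanged is an assumption that has not been checked against tvnamer itself. In a JSON config the escaped dot would be written `\\.`.

```python
import re

name = "csi.s10e13.internal.combustion.hdtv.xvid-fqm.[vtv].avi"
replacement = "CSI.Crime.Scene.Investigation.S"

# Flag passed as an argument to re.compile.
by_flag = re.compile(r"CSI\.S", re.IGNORECASE).sub(replacement, name)

# Inline flag embedded in the pattern text itself.
by_inline = re.sub(r"(?i)CSI\.S", replacement, name)

print(by_flag)
print(by_flag == by_inline)  # True

# Without either flag, the lower-case name is left untouched.
print(re.sub(r"CSI\.S", replacement, name) == name)  # True
```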
