TL;DR: Tools you will need for this (official sites for each for security reasons; if a site dies or gets hacked, do not download, also check signatures where available):
- [www.bulkrenameutility.co.uk] Bulk Rename Utility (mandatory)
- [www.f2ko.de] Duplicate File Eraser (optional but recommended)
- [www.voidtools.com] Everything (optional but very recommended)
QUOTE(Red_Piotrus @ Sep 10 2018, 04:02)
If you try to upload a gallery that contains files with the same name, the system will only upload one of them, and reject others.
That's by design as Tenboro pointed out.
QUOTE
This is a problem when uploading some zip archives from DLsite or such, some artists tend to have the annoying file structure, in extreme cases it can be dozens of folders, each with a file of the same name (different image, same name, ex. folder1/001.jpg, folder2/001.jpg, etc.). This requires manual renaming (yeah, I know, batch renaming is a thing, but if you are dealing with bunch of folders with a single image, I don't think any batch renaming can help much).
YMMV, but Bulk Rename Utility (see wiki) does a perfect job with this. If you're extra paranoid (or because the software is just that good!) then you should also use Everything. And lastly, you have Duplicate File Eraser for hash-wise dupes!
If you also need to pad with zeroes, BRU can do that for you. You don't need any more than (int) log10(2000)+1 = 4 digits as a prefix/suffix (0000-1999, or 0001-2000 depending on how you choose to index).
Interesting note: Should someone grab your gallery with the doggiebag archiver, if it's padded in the same way the DBA pads the files, they will be left as is!
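For the curious, here is the padding arithmetic as a minimal Python sketch. The function names (pad_width, padded_name) are made up for illustration and have nothing to do with how BRU does it internally; the 2000 default is just the file count used above.

```python
import math

def pad_width(max_files: int) -> int:
    """Number of decimal digits needed to write max_files."""
    # Equivalent to (int) log10(n) + 1 for positive n,
    # but len(str(n)) sidesteps any float rounding.
    return len(str(max_files))

def padded_name(index: int, ext: str, max_files: int = 2000) -> str:
    """Zero-pad index to the width needed for max_files."""
    return f"{index:0{pad_width(max_files)}d}{ext}"

print(int(math.log10(2000)) + 1)  # 4
print(pad_width(2000))            # 4
print(padded_name(7, ".jpg"))     # 0007.jpg
```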
Let's start with Bulk Rename Utility.
- Before proceeding, and if you have enough free disk space, back up the work (copy the folder, keep the archive, whatever, just make sure you can restart from scratch in case shit happens). There is no undo after you rename, so you'll have to delete the files and start over if you screw up.
- Run Bulk Rename Utility.
- Browse to the work's root directory.
- Under Selections (12), bottom-right, tick Subfolders to display the content of subfolders as well.
- Under Append Folder Name (9), change Name to Prefix, and Sep. (separator) to the separator you wish to use (I go with underscores, but your call). Default depth Levels is 1. If you need to go deeper than that, increase it accordingly.
- Now you're ready to proceed. Select all files that show up and check the New Name column: each new name should match the folder structure the file comes from, and no name should repeat. If there is a repeat, you have many ways to bypass this; pretty sure you'll find an original way on your own if increasing the depth doesn't work. Then hit Rename. It's pretty much instant.
- Once the files are renamed, read below. You'll want to grab all files in one go to make things faster, and if you did everything right, they're all named just fine so you don't have to worry about anything anymore.
- Done? Exit BRU.
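If you'd rather see the logic than click through it, the steps above can be sketched in Python. This is NOT what BRU runs internally, just a rough equivalent of "prefix each file with its folder name, then check for repeats". All function names are hypothetical, `levels` mimics the Levels setting (keep it at 1 or more), and files sitting directly in the root are left alone.

```python
from pathlib import Path

def plan_prefix_renames(root: Path, sep: str = "_", levels: int = 1) -> dict[Path, Path]:
    """Map each file in a subfolder of root to a new path whose name is
    prefixed with up to `levels` parent folder names."""
    plan = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        parents = path.relative_to(root).parts[:-1]  # folders between root and file
        if not parents:
            continue                                  # file is directly in root: skip
        prefix = sep.join(parents[-levels:])
        plan[path] = path.with_name(f"{prefix}{sep}{path.name}")
    return plan

def find_name_clashes(plan: dict[Path, Path]) -> list[str]:
    """New file names that would repeat once everything sits in one folder."""
    seen, clashes = set(), []
    for new in plan.values():
        key = new.name.lower()      # compare case-insensitively, like Windows does
        if key in seen:
            clashes.append(new.name)
        seen.add(key)
    return clashes

def apply_renames(plan: dict[Path, Path]) -> None:
    """Rename in place. Only call this once find_name_clashes(plan) is empty."""
    for old, new in plan.items():
        old.rename(new)

# Usage (dry run first, rename only if nothing repeats):
#   plan = plan_prefix_renames(Path("mystuff"), sep="_", levels=1)
#   if not find_name_clashes(plan):
#       apply_renames(plan)
```

The dry-run/apply split is deliberate: like in BRU, you want to look at the new names before anything touches the disk.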
If you're asking yourself where Everything comes into play: it helps you grab the files after you rename them en masse (this way, you don't have to check each folder; you can just use Everything and cut-paste into a temporary one with all files). Here are the steps for Everything:
- Run Everything. We'll assume you have an /ehtemp/ folder ready to paste the files into (you can create it now or later, doesn't matter). We'll also assume your freshly renamed folder (with now empty subfolders) is named /mystuff/.
- Tip: Add file: to the query to force a file-only search (Help>Search Syntax).
- The entire list of files that was shown in BRU will show up here too, but now you can cut (or copy) and paste them. Create a temporary folder if you haven't done so yet, then paste everything (no, not the program, all the files) in it.
- Done? Keep Everything open if you copypasted, or you can just exit it if you don't need it anymore for now.
Now, navigate to your temporary folder, and voilà! You have a ready set of files to upload, all organized, and all with unique names! You're ready to upload. Well, almost.
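For reference, the "grab everything and paste it into one folder" step can also be sketched in Python. This does not use Everything at all; it's a plain folder walk, and flatten_into is a made-up name. It assumes the destination folder sits outside the source folder, and it refuses to overwrite instead of silently dropping a same-named file.

```python
import shutil
from pathlib import Path

def flatten_into(src_root: Path, dest: Path) -> list[Path]:
    """Move every file under src_root into dest (one flat folder).
    Raises FileExistsError rather than overwriting a same-named file."""
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # Build the file list first so the walk is not affected by the moves.
    files = sorted(p for p in src_root.rglob("*") if p.is_file())
    for path in files:
        target = dest / path.name
        if target.exists():
            raise FileExistsError(f"name clash, not overwriting: {target}")
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved

# Usage, with the folder names assumed above:
#   flatten_into(Path("mystuff"), Path("ehtemp"))
```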
Need to remove duplicate files? Let's do that too. Grab Duplicate File Eraser from the official site.
- Tip: If you feel like a 3x hash check is overkill (or you're doing tons of files), CRC32's the fastest but most likely to collide and SHA1's the slowest but least likely. Combine as you see fit for your specs. Using all three methods pretty much guarantees no collision will happen.
- Before moving on, again, just to be safe, you should also copy that folder. Duplicate deletion cannot be undone.
- Run Duplicate File Eraser. Make sure it's the Unicode version. Don't run x64 if your system is not 64-bit. (Goes without saying!)
- Browse to your temporary folder.
- Go nuts and collision-check for MD5, SHA1 and CRC32 at the same time. You'll only be doing 2000 files at most, so you don't have to worry.
- Make sure no limitation is checked for both file and filesize (unless you need that somehow). Unless you're worried about accidentally hitting your root folder (remember, it won't delete anything until you tell it to, and there is a confirmation prompt - you can always kill the process), you can leave all search options on, they're pretty useful for deeper deduping (if you have the bad habit of hiding your stuff when this does absolutely nothing, you'll definitely want Search hidden files on, for example). If you're really paranoid, uncheck Search system files (there's hardly a reason to dedupe that except for DLLs very rarely, lol).
- Enough text! Let's proceed. Click the magnifying glass icon to Start searching. Depending on your CPU and disk speed, it should still be fairly fast. Be patient, won't take more than a couple minutes.
- Found dupes? Well, that's a good thing because that's what you're running it for! :lol: If you don't find any dupes, it's all good, exit and upload away!
- To get rid of dupes, carefully click Select duplicates (this will only keep the first version of the file namewise in each "group"). If the selection doesn't satisfy you, you can invert the selection, reverse it, or mess with it manually. When ready, make sure all is well, click Delete, and wait for the files to get nuked.
- Done? Exit DFE!
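To make the "hash-wise dupe" idea concrete, here's a minimal Python sketch of the same check: fingerprint every file with CRC32, MD5 and SHA1, group identical fingerprints, and keep the first file namewise in each group. This is my own illustration, not DFE's actual code, and the function names are hypothetical. It only lists the dupes; deleting them is left to you.

```python
import hashlib
import zlib
from collections import defaultdict
from pathlib import Path

def fingerprint(path: Path, chunk_size: int = 1 << 20) -> tuple[int, str, str]:
    """Return (CRC32, MD5 hex, SHA-1 hex) of a file, read in chunks."""
    crc, md5, sha1 = 0, hashlib.md5(), hashlib.sha1()
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)   # running CRC over all chunks
            md5.update(chunk)
            sha1.update(chunk)
    return crc, md5.hexdigest(), sha1.hexdigest()

def find_duplicates(folder: Path) -> list[Path]:
    """List files whose content matches an earlier file (in name order).
    The first file of each group is kept; nothing is deleted here."""
    groups = defaultdict(list)
    for path in sorted(folder.iterdir()):
        if path.is_file():
            groups[fingerprint(path)].append(path)
    return [p for paths in groups.values() for p in paths[1:]]

# Usage: review the list before deleting anything.
#   for dupe in find_duplicates(Path("ehtemp")):
#       print(dupe)
```

Two files only count as dupes here if all three values match, which is the "go nuts" setting from the steps above.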
Now you're ready to upload a perfect gallery without any dupe whatsoever.
The major upsides:
- This effectively keeps the folder structure intact by prepending the folder name as a prefix (and removes the chance of any name dupe).
- You can potentially get rid of a shit ton of hashwise dupes you didn't even suspect, and you can be damn sure they're actual dupes. Very useful for imagesets!
- You don't have to archive anything, so there won't be any encoding conflicts. Non-ASCII filenames, should they exist, will be parsed properly by the PHP engine.
QUOTE
So bottom line, uploaders have to manually rename such files. Which works, to a point - I am sure many uploaders, perhaps even myself, missed a few files, because some archives are just a pain due to complex (asinine, annoying, dumb, choose your favorite) folder structure. (I just now uploaded a gallery, thought it had too few pictures, on a hunch checked folders, and found a few extra images with same name... how many uploaders won't double check this or miss it?)
I can reassure you on this, we all make mistakes. What's important is to spot them ASAP and fix them. Idiots will blindvote, and everyone will be held responsible, starting with the person who started the mess. (Not trying to be offensive here; people just don't pay enough attention to details.) Whatever Alpha may say, DO sweat the small things, it helps the entire community tremendously and prevents people from having to fix a metric fuckton of old derps a decade later. (That doesn't mean spend one hour on something, that's just silly.) That does however mean that when you tag garbage like fleglock because shortcuts are fun - they aren't if you use them wrong, and nope, I am (politely) looking at the topmost senior tag moderator here, not you, Red :) - you should ALWAYS, always double-check to make sure your actions spawn the right tag! Now that we have namespaces, there is ZERO excuse for such situations! But yes, accidents happen. They always happen. It just becomes a problem when they happen almost all the time. ;) </rant>
QUOTE
I am not sure if the gallery upload does generate error reports when identically named files are uploaded, but some uploaders won't read or understand them. And this is, in the end, our loss, as we get incomplete galleries.
It doesn't, they get discarded.
QUOTE
A simple fix would be to automatically rename dupe files. If the uploader doesn't like how this turns out, then they can go and manually rename stuff and reupload it. This would eliminate the problem of incomplete archives.
As pointed out by Tenboro, not an option.
Well, I hope my little guide for BRU, DFE and Everything was useful. :)
- AL