SEOClerks

How to scrub URLs from one list from another?



So I have 2 files of URLs. One of them has more URLs than the other. What I want to do is remove from the bigger list all the URLs that are also in the smaller list, if that makes sense.

I'll say it again just in case.

So I have 2 files of URLs right.

One has about 350+ URLs in it.

And the other list has about 100 URLs in it.

I want to remove all the URLs that are in the list with about 100 URLs, from the list of 350 URLs.

So I want to scrub the list of 350 URLs against the list of 100 URLs.

And remove any URLs in the 350 list that are in the 100 list.

Does anyone know of any free online duplicate list scrubber tool or otherwise for that?

Or perhaps some list comparison tool online or equivalent?

I used to use texthat.com for this, but they went down ages ago, which is a shame because they had a full suite of text tools, list comparison tools, etc.

Anyone know anything like that?

Comments

procoder
If I'm understanding it right, you want to compare two files of URLs and remove the duplicates from them, right? Personally I mostly use Notepad++, as it does the job perfectly, but I think there are a lot of duplicate remover tools online too.

I'd never heard of that website before, but I thought I'd check it with the Wayback Machine first, and yes, it was archived there. What's interesting is that you can still use it from the archive just like you did on their domain.

However, after doing some research on Google I found a copy of it; actually I think it's the same website under a different domain name. Here it is: spintaxgen.com. Hope that's what you wanted.



idealmike
Cheers pro! Respect brother. Yeah, compare and scrub, or compare and remove from the big list the duplicates that are in the small one. How do you scrub duplicate URLs/lines using Notepad++? Is there some tool or option for that? Never even knew it could do that!

Yeah, it was a good site for doing stuff like this, and it figures that it would still be in the Wayback Machine. Cool how it still works even though it's only a cached version! That site you found seems to be very similar, cheers for that. I also did a bit more digging and found a site that is ideal and seemingly dedicated to this kind of thing. It's www.quickdiff.com

When you compare the two lists, any lines, code or URLs that are not in the first field are highlighted in green for you. It's not the ideal way around it, as it puts them all together and you have to select and copy those lines/URLs yourself. But it's a workaround nonetheless.
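
For anyone who would rather skip the select-and-copy step, the same kind of line comparison can be sketched in Python with the standard `difflib` module. This is only a rough illustration and not how quickdiff.com works internally; the function name `added_lines` and the example URLs are made up. A line diff depends on the order of the lines, so sorting both lists first tends to give cleaner results.

```python
import difflib


def added_lines(first, second):
    """Return the lines that a line diff marks as present only in `second`."""
    diff = difflib.ndiff(first, second)
    # ndiff prefixes lines unique to the second sequence with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical example data, already sorted.
first = ["http://a.example/", "http://b.example/"]
second = ["http://a.example/", "http://b.example/", "http://z.example/"]

for url in added_lines(first, second):
    print(url)
```

The lines it prints correspond to what the diff tool would highlight as new in the second field.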



procoder
You're welcome mate! Yeah, you need a plugin called "TextFX" to do that, and it can be installed very easily from within Notepad++. To install it, open Notepad++ and go to Plugins, then Plugin Manager, then Show Plugin Manager. You'll now see all the available plugins you can install; search for TextFX, select it and install it, and that's it. It's a huge plugin with very useful features. Simply put, it's great.



Everett
Basically, if you want to remove duplicate lines of the same text, you can do so in cPanel alone. They use JavaScript code for that.

How to remove duplicate text:
  1. Open your cPanel
  2. Create a new document (.html or .php, it doesn't matter)
  3. Make sure you're in the code editor (click Code Editor)
  4. Press CTRL + F
  5. A window should pop up
  6. Enter the text to find
  7. Enter the replacement text
  8. Click on Replace

This is how I do stuff like this. It's quick and easy. Actually comparing two lists and removing from one the lines that the other also has is a trickier process, though. It can't be done with any cPanel tools, so you'll probably have to use another tool, or even code a script to do it for you if you know how to code.
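
A script along those lines can be quite short. The following is a minimal Python sketch, assuming each list holds one URL per line and that two URLs count as the same only when they match exactly after trimming surrounding whitespace. The function name `scrub` and the example URLs are hypothetical.

```python
def scrub(big_list, small_list):
    """Return the lines of big_list that do not appear in small_list, in their original order."""
    # A set makes each "is this URL in the small list?" check fast.
    to_remove = {line.strip() for line in small_list if line.strip()}
    kept = []
    for line in big_list:
        url = line.strip()
        if url and url not in to_remove:
            kept.append(url)
    return kept


# Hypothetical example data standing in for the 350-URL and 100-URL lists.
big = ["http://a.example/", "http://b.example/", "http://c.example/"]
small = ["http://b.example/"]

for url in scrub(big, small):
    print(url)
```

To run it on two files, each file could be read into a list with `pathlib.Path(name).read_text().splitlines()` before calling `scrub`. Variations such as a trailing slash or `http` versus `https` are treated as different URLs here.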



idealmike
Oh yeah, that's a good point actually. WordPress does a similar thing for revisions too, so you can see what is actually different/new compared to the old version. I guess I could create a post with one list and save it as a draft so a revision is saved, then edit it again, paste in the other URLs and save again so another revision is made, and then compare them to see what's different, as it highlights what's new in green for you. But yeah, like you say, removing dupes could be a bit tricky, but at least it would be a workaround anyway. I might have to try this trick with cPanel and see if it works the same way in practice.

I knew that you guys would have a solution for me. Surprising what you can get done when you put your mind (or other people's minds) to it eh lol



Cristian
This may sound stupid, but what if you paste everything into Excel and just use the Remove Duplicates tool? It will remove every duplicate line and keep just one.
At least that's the way I do it, both for URLs and for anything else for that matter. Especially when I do very extensive keyword research, I get a lot of duplicate keywords that need to be removed and organized into groups, so I always use MS Excel.

For some reason I haven't quite adapted to Google Sheets, even though it is a lot faster and you can basically do the same things you can do in Excel.
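
The remove-duplicates step described here (drop repeats, keep one copy of each line) can also be sketched in a few lines of Python. This is an illustration under the assumption that lines match exactly after trimming whitespace; the function name `dedupe` and the sample keywords are made up. If two lists are pasted together first, a URL that is in both lists still survives once, so this is deduplication rather than scrubbing one list against the other.

```python
def dedupe(lines):
    """Drop repeated lines, keeping the first occurrence of each in its original position."""
    cleaned = (line.strip() for line in lines if line.strip())
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(cleaned))


# Hypothetical example data.
keywords = ["seo tools", "link building", "seo tools", "keyword research", "link building"]

for keyword in dedupe(keywords):
    print(keyword)
```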


