
Corrupt Free File Transfer Over FTP

Date post: 25-Nov-2015
Upload: prasanth-karri
Description:
How to transfer files over FTP reliably, without corruption.
Transcript
  • CFiTT - Corrupt Free File Transfer Technique over FTP

    Amna Sultana, Muhammad Farhan Bashir, Muhammad Abdul Qadir
    Centre for Distributed and Semantic Computing (CDSC), Department of Computer Science, Mohammad Ali Jinnah University, Islamabad Campus, Pakistan
    amnasultana@gmail.com, m.farhan.bashir@gmail.com, aqadir@jinnah.edu.pk

    Abstract

    Communication is one of the basic necessities of human beings, and file transfer is one of its basic forms. Reliability is the key issue, raised by the complex nature of networks and the growth of computer science. In this paper we devise a technique for file transfer that identifies whether some portion of the file was received corrupt and, if so, exactly which portion. The technique provides reliability by eliminating the corruption from a file, and it requires less network bandwidth because it reduces the amount of data to be re-sent in case of corruption. Reliability is ensured with the help of a file signature generation method which we have devised in this paper. The beauty of this technique is that it generates hashes which are not easy to break, hence helping to secure the file. We use TCP as the underlying protocol. Although TCP is considered reliable, it does not fully guarantee uncorrupted transfer, because its error detection relies on a checksum that remains vulnerable to certain network conditions and to malicious attacks. Our technique operates at the application layer and tries to cope with the reliability of the file transfer. We have also developed a prototype to test the integrity of our technique, and the empirical results support its reliability. The emphasis of this paper is to provide users with corruption-free file transfer over the network, so that their time and valuable resources are saved.

    1. Introduction

    One of the basic necessities of living things in this universe is communication. Information is shared in many different forms and media, such as signals, voice, and text. File sharing, one of the basic forms of sharing, has been practiced since human beings started maintaining records. Files transferred over a transmission medium contain valuable information. These files vary in size and in importance to their users. One of the concerns of users transmitting files is that the file should reach the destination accurately, that is, complete and uncorrupted.

    The File Transfer Protocol (FTP), a commonly used protocol for transferring files over TCP, allows platform-independent data exchange [1, 2]. The size of information is increasing with the growth of computer science. Networks are growing rapidly and are more congested with traffic at every passing moment. So, on the one hand, information sent over the network may not always reach its destination or may be corrupted during transmission. On the other hand, the reliability of networks is never assured. Suppose that we are sending a file to some destination and the network link goes down during the transmission. If the file is small it can easily be resent, but what if a large file (gigabits) was being sent and it was corrupted in the very last moments of the transfer? The situation becomes much worse if the whole file has to be resent. A similar situation arises when important data is being sent or received and the file is damaged during transmission for any reason: the purpose of sending the information is defeated, and we may need to retransfer the whole file from scratch.

    In this paper we present an algorithm that is specifically designed to overcome the drawbacks mentioned above. We call our technique Corrupt Free File Transfer using FTP. The purpose of our technique is to provide reliability in transfer and to overcome the problems which normally arise due to the underlying network. Using the file signature method, we have developed this technique especially for users transferring large files, although other users are also targeted. Our technique is capable of identifying the exact location of an error at the receiving site. This capability enables it to request retransmission of only the part of the file that was received corrupted and to embed it in

    Authorized licensed use limited to: GMR Institute of Technology. Downloaded on April 24,2010 at 11:19:39 UTC from IEEE Xplore. Restrictions apply.

  • the file at its position. This technique helps us save a lot of time and valuable resources.

    The rest of this paper is organized as follows: Section 2 surveys existing file transfer techniques, Section 3 presents the proposed technique, Section 4 contains the empirical test results of the proposed technique, and Section 5 concludes the paper.

    2. Background and Related Work

    A number of file transfer applications exist in the market, developed on the basis of different techniques. Many of these are used for professional purposes. Each of these applications is created for a slightly different purpose. For example, there is a class of file transfer applications called "Secure File Transfer Applications", whose purpose is to send or receive files only between authenticated users.

    This section covers the existing file transfer techniques and existing file signature methods. The focal point of this survey is to find the file transfer techniques which provide reliable transfer.

    Fu [3] presented a technique for files replicated on different sites. The authors are of the view that, instead of transmitting a whole file from one server to another to check the integrity of the files, one can use a file signature to accomplish the same task. That paper creates a signature over the whole file, but the proposed technique cannot identify the exact location of an error in large files. The signature method is only capable of identifying that the two files are not exactly the same, that is, that one has changed.

    Chang [4] presented a technique for multimedia databases. The authors use signatures to identify the image icons of the database. These signatures can tell that two icons are not the same, but they may not describe the exact location where the two icons differ. That is, the information provided by the signature is incomplete: it tells that there is some problem but not exactly where the problem exists.

    Chen [5] presented a signature method for multimedia objects. The scheme uses a hierarchy of objects, and signatures are generated in the light of that hierarchy. This scheme may reduce disk accesses, but when it comes to identifying which part of the hierarchy caused a failed match, the technique remains quiet. One reason is that the signatures used in this technique are not capable enough to identify the exact location of a mismatch. They are designed only to establish a perfect match.

    Du [6] presents extendable signatures. This technique was introduced in the early days of the file signature paradigm. It provides only the basic functionality of hashing and signature files.

    Lin [7] presented a variation of signature files. The focus of that research is only on improving the response time of signatures. The technique suffers from weaknesses similar to those mentioned for the other techniques.

    Lee [8] presents efficient multilevel signature methods, which he describes as a much faster technique for text retrieval. If we critically review the technique discussed in that paper according to our interest, we come to the same conclusion: this signature method also lacks the properties mentioned above.

    Madduri [9] presents reliable file transfer for Grid environments [...]. Reliable file transfer there means that, if any kind of error occurs during the file transfer, the transfer can be resumed from an earlier point rather than started from the beginning. That paper does not include corruption-free reception as part of reliability, so its definition of reliability is incomplete. This is one of the reasons it does not satisfy our requirement.

    He [10] has proposed a [...] application. The purpose of this file transfer application differs from that of other applications: it is designed for bulk data transfer over high-speed networks. The application is therefore quite different from our domain, and the technique serves a different purpose within file transfer.

    The survey shows that many different techniques exist for reliable file transfer over the network, and many of them use a signature file mechanism to provide reliability. One problem is commonly seen in most of the techniques: the definition of reliability is incomplete. Because of this, the techniques remain silent on the aspects that are not considered part of the definition. The existing techniques have different perspectives, and their areas of application may differ a lot, but the underlying purpose of transferring a file is the same, which keeps them in a single category. We have tried to identify and overcome the problems found in the existing techniques. One of the biggest problems we see is that the techniques either do not emphasize reliable transfer or do not fulfill the requirements of reliability. The technique proposed by us, described in the next section, tries to cope with the above-mentioned problems.
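
    The limitation noted throughout this survey can be illustrated with a short sketch. It is not taken from any of the surveyed papers: SHA-256 from Python's standard library stands in for their signature methods, and the function names, the sample data, and the 1024-byte block size are assumptions chosen for illustration. A single whole-file signature reveals only that two copies differ, while one signature per block also reveals where.

```python
import hashlib

def whole_file_signature(data: bytes) -> str:
    # One digest for the entire file (SHA-256 is a stand-in, not the surveyed method).
    return hashlib.sha256(data).hexdigest()

def block_signatures(data: bytes, block_size: int = 1024) -> list:
    # One digest per fixed-size block; the last block may be shorter.
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

# Hypothetical sample data: 4096 bytes, i.e. four 1024-byte blocks.
original = bytes(range(256)) * 16
received = bytearray(original)
received[2500] ^= 0xFF          # flip one byte; offset 2500 lies in block index 2
received = bytes(received)

# The whole-file signature only says "the files differ".
files_differ = whole_file_signature(original) != whole_file_signature(received)

# Per-block signatures say which block differs.
changed_blocks = [i for i, (a, b) in enumerate(
    zip(block_signatures(original), block_signatures(received))) if a != b]

print(files_differ, changed_blocks)
```

    With per-block signatures, only the listed block would need to be requested again, which is the property the surveyed techniques lack.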


  • 3. Proposed Reliable File Transfer Technique

    [...] Our main purpose is handling corruption in a file, that is, detecting the exact location where the corruption has occurred at the receiving end. If we know the exact location of the corruption, we can ask the sending site to resend only the part of the file that was received corrupted.

    Algorithm: Generate file with Signatures
    Input: User File in ASCII (F0)
    Output: User File with Signatures appended at end (Fn)
    Method: In order to apply the hash function on each n-byte block of the file, we perform the following steps to make (m mod n) = 0 for F0:
        m ← Length of (F0)
        n ← Length of Block (any one of 128, 256, 512, 1024, 2048, 4096, 8192 bytes)
        res ← reserved 16 bytes
        P ← m mod n
        Q ← n - (P + res)
        if (Q > 0)
            EsF(Q [...]

    The method of identifying corruption at the receiving site uses a similar technique. The algorithm at the receiving site first identifies the actual size of the file received. Then it separates the signatures from the received file. After this step the file contains only the original data with the appended zeros and the 16 reserved bytes. The algorithm then regenerates the signatures of this file and compares them with the received signatures. If the signatures match exactly, the file was received without errors. If they do not match, the file is corrupted. The question then is what should be done. Should we retransmit the whole file, or try to identify the portion of the file which is corrupted? Considered from the perspective of cost, and given that we target users who are interested in large files, resending the whole file from scratch is not a good option for large files. Even for small files, resending the whole file is not advisable. So we try to identify the problematic area in the file and ask the sender application only for that [...]
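
    The padding steps and the receiving-site check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the paper's own signature function is not recoverable from this text, so SHA-256 stands in for it; the handling of Q <= 0 (wrapping around to the next block boundary) and the use of the 16 reserved bytes to record the original length are assumptions; all function names are hypothetical.

```python
import hashlib

RESERVED = 16   # reserved bytes, as in the algorithm above
SIG_LEN = 32    # SHA-256 digest size (stand-in signature, an assumption)

def pad(data: bytes, n: int) -> bytes:
    """Zero-pad so that len(data) + padding + RESERVED is a multiple of n."""
    p = len(data) % n
    q = (n - (p + RESERVED)) % n      # assumption: wrap around when Q <= 0
    # Assumption: the reserved bytes record the original length m.
    return data + bytes(q) + len(data).to_bytes(RESERVED, "big")

def sign(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

def build(data: bytes, n: int) -> bytes:
    """Return the padded file followed by one signature per n-byte block."""
    body = pad(data, n)
    sigs = b"".join(sign(body[i:i + n]) for i in range(0, len(body), n))
    return body + sigs

def corrupt_blocks(received: bytes, n: int) -> list:
    """Return indices of blocks whose recomputed signature does not match."""
    k = len(received) // (n + SIG_LEN)            # number of blocks
    body, sigs = received[:k * n], received[k * n:]
    return [i for i in range(k)
            if sign(body[i * n:(i + 1) * n]) != sigs[i * SIG_LEN:(i + 1) * SIG_LEN]]

def replace_block(received: bytes, n: int, index: int, block: bytes) -> bytes:
    """Splice a retransmitted block back in at its original position."""
    return received[:index * n] + block + received[(index + 1) * n:]

def extract(received: bytes, n: int) -> bytes:
    """Strip signatures, reserved bytes, and padding to recover the user file."""
    k = len(received) // (n + SIG_LEN)
    body = received[:k * n]
    m = int.from_bytes(body[-RESERVED:], "big")
    return body[:m]

# Usage: corrupt one byte in transit, locate the block, and repair only that block.
n = 128
original = b"This is a test program. " * 40       # 960 bytes
sent = build(original, n)
damaged = bytearray(sent)
damaged[300] ^= 0x01                               # offset 300 lies in block index 2
damaged = bytes(damaged)

bad = corrupt_blocks(damaged, n)
repaired = damaged
for i in bad:
    # In a real transfer the sender would retransmit this block on request.
    repaired = replace_block(repaired, n, i, sent[i * n:(i + 1) * n])

print(bad, extract(repaired, n) == original)
```

    Only one 128-byte block is resent here instead of the whole file, which is the saving the technique aims for.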

  • [...] corrupted block, which the sending application is requested to resend.

    [Figure: the input file before sending with signatures and the received file after extracting signatures; signatures S1 to S9 generated at the sending site are compared with those generated at the receiving site, and a resend is requested for block 6.]

    Fig 1: Signature generation and matching process

    We consider the hashes generated by this algorithm hard to tamper with. On the one hand, the algorithm generates hashes of variable length, so it is difficult for an attacker to identify the start and the end of a single signature and to change it. On the other hand, the different block sizes in the above hash function help against collisions in the generated hash codes. Another reason is that each hash code is generated on the basis of the ASCII character and its position in the block.

    4. Empirical Evaluation Of Technique

    We have developed a prototype application for our technique. The purpose of this prototype is to demonstrate the soundness of our idea of reliable file transfer. We have critically evaluated our technique from two major perspectives: whether it identifies the exact location of corruption in the file, and which block size is most suitable for transferring files with our technique. The purpose of the first evaluation is to ensure the reliability of our technique, whereas the purpose of the second is to utilize the client's resources efficiently and to make the technique easy to use.

    4.1 Integrity Testing of Technique:

    Signature generation and comparison provide assurance that either the received file has not been altered or, if it has been altered, the exact location of the corruption is recognized. This test allows us to check whether our technique identifies the exact location of an error when one occurs. We have compared different signature generation methods and selected the most suitable one on the basis of the results they produced.

    In the following scenario, we used a 22-byte ASCII value and generated a signature for it. Then we made small changes to the input string and stored the signatures generated by the prototype. The original string given to the algorithm and the result it generated (in decimal and binary) are as follows:

    Input: "This is a test program"
    Result: Sig = 2422 = 100101110110

    The following scenarios represent variations of the original string.

    Scenario 1: Changing t to s in the substring "test"
    Input: "This is a sest program"
    Result: Sig = 8322 = 10000010000010

    Scenario 2: Adding a single space [...]
    Input: [...]
    Result: Sig = 7922 = 1111011110010

    Scenario 3: Changing t to T in the substring "test"
    Input: "This is a Test program"
    Result: Sig = 122 = 1111010

    Scenario 4: Signature of Empty String
    Input: ""
    Result: Sig = 82311 = 10100000110000111

    The results of all the above scenarios show that a slight change in the string causes major changes in the hashes generated by the algorithm. The strength of our proposed algorithm is that it detects even a minor change in the file, and it is also capable of generating a hash for an empty string. We chose this signature method among many options because it creates very different hashes for even small changes in a block, so it can detect even a small change in any block of the file.

    4.2. Testing for the Right Size of Block:

    We use the block as the unit for identifying errors in our technique. The block size can vary according to the wishes of the users of this technique. It can be chosen large or small, but there are problems with both choices, so we have to identify the most suitable block size. In order to find it, we have to test our algorithm with different block sizes.
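
    The trade-off behind the choice of block size in Section 4.2 can be made concrete with a small calculation. This is a sketch under assumptions, not a reproduction of the paper's measurements: it assumes a fixed 32-byte signature per block (the paper's signatures are variable-length), takes the 1 MB test file as 1024 * 1024 bytes, ignores padding and the reserved bytes, and uses a hypothetical function name.

```python
import math

BLOCK_SIZES = [128, 256, 512, 1024, 2048, 4096, 8192]   # sizes listed in the paper
FILE_SIZE = 1024 * 1024    # assumed byte count for the 1 MB test file
SIG_LEN = 32               # assumed fixed signature length in bytes

def overhead(file_size: int, block_size: int, sig_len: int = SIG_LEN):
    """Return (number of signatures, total signature bytes, bytes resent per corrupt block)."""
    blocks = math.ceil(file_size / block_size)
    return blocks, blocks * sig_len, block_size

table = {n: overhead(FILE_SIZE, n) for n in BLOCK_SIZES}
for n, (blocks, sig_bytes, resend) in table.items():
    print(f"block={n:5d}  signatures={blocks:5d}  "
          f"signature_bytes={sig_bytes:7d}  resend_per_error={resend}")
```

    Larger blocks mean fewer signatures to generate and send, but more data to retransmit for each corrupted block; smaller blocks reverse that balance.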


  • We then select the most suitable block size on the basis of the results generated by our prototype.

    We have tested our technique in a real-time environment under the following conditions:

    Test Data:
        File Type: Word Document
        Size: 1.0 MB

    Test Systems:
        Server:
            Personal computer: Pentium IV
            RAM: 256 MB
            Hard Disk: 60 GB
            Internet Environment: connected at an average of 40 Kbps
        Client:
            Compaq Laptop: Pentium II
            RAM: 128 MB
            Hard Disk: 4 GB
            Internet Environment: connected at an average of 40 Kbps

    Table 1: Results of testing over TCP

    BS   128    256    512    1024   2048   4096   8192
    AS   1MB    1MB    1MB    1MB    1MB    1MB    1MB
    ST   3864   3556   3864   3808   3703   3800   3624

    [The MS row, the C, TM, S, and TT rows for the six test runs, and the AT row are not legible in this transcript.]

    Index Terms: BS (Block Size), AS (Actual File Size), MS (Modified File Size), ST (Simple Transfer Time), C (Client), TM (Transmission Medium), S (Server), TT (Total Time), AT (Average Time).

    Table 1 shows the testing results of our technique with different block sizes. The results were generated using TCP. We performed the test 6 times with each block size and then calculated the average time, in nanoseconds, for the transfer with each block size. The reason for making six different attempts is that the tests are performed in a real environment, where network conditions may differ at different times. We took this factor into account by performing the tests at different times.

    [Chart: processing time (y-axis, ticks from 100000 to 500000) against block size (x-axis, 128 to 8192).]

    Fig 2: Test results graph representing (Process Time/Block Size)

    The information presented in the table above might not convey its message easily, so we use the graph shown in Figure 2 to make it easier to comprehend. The results shown in the graph clearly indicate that 8192 is the most suitable block size. The reason is that the larger the block, the fewer signatures are generated, which minimizes the overhead of signatures. On the other hand, a larger block may cause more overhead when errors occur in the data: a larger block represents a larger amount of data, so when an error occurs a larger amount of data has to be retransmitted. But if we take a small block size, it will create overhead due to [...] up to the conclusion after the experimental results that 8192 is the best block size. This is shown in the test results and the graph above.

    5. Conclusion

    Reliability is an essential requirement of file transfer over a network. In this paper we proposed a technique for reliable file transfer and demonstrated the integrity of our scheme with the help of results generated by our prototype. We have provided ease to the users of our technique by providing them a variable


  • block size for transfer. The empirical results of our prototype help users identify the most suitable block size for their situation. Reliability is provided with the help of the file signature method devised in this paper. The uniqueness of this method is that it is capable of identifying the exact place of an error in the file, which saves resources and time. We use TCP as the underlying protocol. TCP is considered reliable, but it does not ensure reliability to a great extent, because its error detection relies on a checksum which is not secure in all cases and may sometimes fail to identify corruption. We have tried to overcome this shortcoming of TCP reliability in our technique. The technique works at the application layer and tries to identify the exact place of errors in the file. It is designed to use minimum bandwidth to heal the corruption. For example, if the file is received corrupt, it requests only the part of the file which is corrupt and joins it with the rest of the file.

    6. References

    [1]. C. Boris, R. Claudia, and S. Marie-Luise, "Semantic Cache Mechanism for Heterogeneous Web Querying", Computer Networks, 31(11-16): 1347-1360, 1999.

    [2]. K. Kato, and T. Masuda, "Persistent Caching: An Implementation Technique for Complex Objects with Object Identity", IEEE Transactions, July, 1992.

    [3]. A. W. Fu, and S. C. Chan, "Locating more corruptions in a replicated file," p. 168, 15th Symposium on Reliable Distributed Systems (SRDS '96), 1996.

    [4]. J. W. Chang, and J. Srivastava, "Spatial Match Retrieval Using Signature Files for Iconic Image Databases," p. 658, International Conference on Multimedia Computing and Systems (ICMCS'97), 1997.

    [5]. Y. H. Chen, A. J. T. Chang, and C. Lee, "Object Signatures for Supporting Efficient Navigation in Object-oriented Databases," p. 502, 8th International Workshop on Database and Expert Systems Applications (DEXA '97), 1997.

    [6]. D. H.-C. Du, S. Ghanta, K. J. Maly, and S. M. Sharrock, "An Efficient File Structure for Document Retrieval in the Automated Office Environment," IEEE Transactions on Knowledge and Data Engineering, vol. 01, no. 2, pp. 258-273, Jun., 1989.

    [7]. Z. Lin, and C. Faloutsos, "Frame-Sliced Signature Files," IEEE Transactions on Knowledge and Data Engineering, vol. 04, no. 3, pp. 281-289, Jun., 1992.

    [8]. D. L. Lee, Y. M. Kim, and G. Patel, "Efficient Signature File Methods for Text Retrieval," IEEE Transactions on Knowledge and Data Engineering, vol. 07, no. 3, pp. 423-435, Jun., 1995.

    [9]. R. K. Madduri, C. S. Hood, W. E. Allcock, "Reliable File Transfer in Grid Environments," p. 0737, 27th Annual IEEE International Conference on Local Computer Networks (LCN'02), 2002.

    [10]. E. He, J. Leigh, O. Yu, T. A. DeFanti, "Reliable Blast UDP: Predictable High Performance Bulk Data Transfer," p. 317, IEEE International Conference on Cluster Computing (CLUSTER'02), 2002.


