Advanced use of libjpeg - help!



  • Hi everyone,

    I'm not sure whether this fits the forum, but I no longer know where else to ask. Maybe someone has run into a similar problem before.

    Here is the situation: in a project we want to speed up JPEG compression by using programmable graphics cards. However, only certain parts of the JPEG algorithm are to be replaced (color conversion, downsampling, DCT, quantization). Huffman coding should still be done by libjpeg, as should all the header generation etc. After a long search we found the function jpeg_write_coefficients(), which does exactly what we need: it takes already quantized input data, Huffman-codes it, and generates a standard-conforming JPEG from it. The only problem seems to lie in feeding the three component arrays (Y, Cb, and Cr) correctly, because the resulting image looks like this [img=http://aycu36.webshots.com/image/41635/2001922702231984863_rs.jpg] and not like this [img=http://aycu30.webshots.com/image/41829/2001918893073253397_rs.jpg], which is what is expected.
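
    As a sanity check on the component arrays: with libjpeg's default sampling for YCbCr (2x2 for Y, 1x1 for Cb and Cr), each component's block grid follows from the image size by rounding up, so partial edge blocks are counted even when the dimensions are not multiples of 16. A minimal sketch of that arithmetic; BlockGrid and component_grid are hypothetical names, not libjpeg API:

```cpp
#include <cassert>

struct BlockGrid {
    unsigned width_in_blocks;
    unsigned height_in_blocks;
};

// Integer division rounding up: ceil(a / b).
static unsigned div_round_up(unsigned a, unsigned b) {
    assert(b > 0);
    return (a + b - 1) / b;
}

// Block grid of one component. h_samp/v_samp are the component's sampling
// factors, max_h/max_v the largest factors over all components.
// The constant 8 is the DCT block size.
BlockGrid component_grid(unsigned image_width, unsigned image_height,
                         unsigned h_samp, unsigned v_samp,
                         unsigned max_h, unsigned max_v) {
    BlockGrid g;
    g.width_in_blocks  = div_round_up(image_width  * h_samp, max_h * 8);
    g.height_in_blocks = div_round_up(image_height * v_samp, max_v * 8);
    return g;
}
```

    For a 640x480 image this gives 80x60 blocks for Y and 40x30 for Cb and Cr, the same as plain truncating division by 8 and 16. For 100x50 it gives 13x7 and 7x4, where truncation would yield 12x6 and 6x3.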

    This is the source code that reads the input data and produces the final JPEG.

    Frame_t* CudaJPEG_t::compress(int quality, Frame_t* input){
    
      struct jpeg_compress_struct cinfo;// Contains all information needed to compress & code
      struct jpeg_error_mgr jerr;   	// a JPEG error handler
    
      //////////////////////////////////////////////////////////
      // Step 1: allocate and initialize JPEG compression object 
      cinfo.err = jpeg_std_error(&jerr); // setting up an error handler
      jpeg_create_compress(&cinfo); 	 // initialize the JPEG compression object. 
    
      //////////////////////////////////////////////////////////
      // Step 2: specify data destination 
      int jpeg_buffer_size = input->getWidth() * input->getHeight() * 3;
      jpeg_memory_dest(&cinfo, input->getCompressedPixels(), jpeg_buffer_size);
    
      //////////////////////////////////////////////////////////
      // Step 3: set parameters for compression 
    
      cinfo.image_width = input->getWidth(); 	// width
      cinfo.image_height = input->getHeight();  // height
      cinfo.in_color_space = JCS_RGB; 			// colorspace of input image 
    
      jpeg_set_defaults(&cinfo);				// default compression parameters depending on color space
    
      cinfo.input_components = 3;				// # of color components per pixel 
    
      jpeg_set_quality(&cinfo, quality, TRUE);
    
      //////////////////////////////////////////////////////////
      // Step 4: Start compressor
    
      //jvirt_barray_ptr is a pointer to a jvirt_barray_control structure
      jvirt_barray_ptr coef_arrays[3];
    
      // Request the virtual block arrays (request_virt_barray is defined in jmemmgr.c):
      // request_virt_barray(cinfo, pool_id, pre_zero, blocksperrow, numrows, maxaccess)
      // blocksperrow/numrows: array size in blocks; JBLOCK == JCOEF[64]; JCOEF == short
      // maxaccess: maximum number of block rows requested in a single access_virt_barray call
      // NOTE: the integer divisions below truncate, so the sizes are only right if width and height are multiples of 16
      coef_arrays[0] = (*cinfo.mem->request_virt_barray) ((j_common_ptr)&cinfo, JPOOL_IMAGE, FALSE, cinfo.image_width /  8, cinfo.image_height /  8, 8); 
      coef_arrays[1] = (*cinfo.mem->request_virt_barray) ((j_common_ptr)&cinfo, JPOOL_IMAGE, FALSE, cinfo.image_width / 16, cinfo.image_height / 16, 8);
      coef_arrays[2] = (*cinfo.mem->request_virt_barray) ((j_common_ptr)&cinfo, JPOOL_IMAGE, FALSE, cinfo.image_width / 16, cinfo.image_height / 16, 8);
    
      // This writes the header and allocates memory for the virtual arrays. The coef arrays should obviously be filled after this. 
      // Actual data is written by jpeg_finish_compress. 
      jpeg_write_coefficients(&cinfo, coef_arrays); 
    
      //////////////////////////////////////////////////////////
      // Step 5: The actual compression
    
      // Call CUDA
    
      // 4:2:0 layout: full-size Y plane followed by quarter-size Cb and Cr planes
      int numPixels = input->getWidth() * input->getHeight();
      int dctSize = numPixels + numPixels / 2;
      char* dctBuffer = new char[ dctSize ];
    
      JPEG_cudaCompress(quality, input, dctBuffer);	  
    
      // Debug aid: value range of the GPU output (min and max are not used further below)
      int max = -200;
      int min =  200;
      for(int i = 0; i<dctSize; i++){
        if (dctBuffer[i] < min) min = dctBuffer[i];
        if (dctBuffer[i] > max) max = dctBuffer[i];
      }
    
      // write the Y Data 
      int blocksPerRow = cinfo.image_width /  8;
      int maxRows = cinfo.image_height / 8;
    
      for (int row=0; row<maxRows; row++){
        // Load the current block row into the buffer. Access function: access_virt_barray(cinfo, jvirt_barray_ptr, start_row, num_rows, writable)
        // Return type: JBLOCKARRAY == JBLOCKROW*, JBLOCKROW == JBLOCK*, JBLOCK == JCOEF[64]
        JBLOCKARRAY buffer = (*cinfo.mem->access_virt_barray) ((j_common_ptr)&cinfo, coef_arrays[0], row, 1, TRUE);
        JBLOCKROW blockrow = buffer[0];
        for(int block=0; block<blocksPerRow; block++){
          for(int k=0; k<64; k++){
            // Assumes dctBuffer holds the 64 coefficients of each block contiguously, block row after block row
            blockrow[block][k] = dctBuffer[row * cinfo.image_width * 8 + block*64 + k];
          }
        }
      }
    
       //write the Cb and Cr data
    
       int width = cinfo.image_width / 2;
       int height = cinfo.image_height / 2;
       maxRows = height / 8;
       blocksPerRow = width / 8;
    
       // Cb follows the Y plane; Cr follows Cb, which occupies a quarter of the Y size
       char* cbBuffer = &dctBuffer[cinfo.image_width * cinfo.image_height];
       char* crBuffer = &cbBuffer[cinfo.image_width * cinfo.image_height / 4];
    
       for (int row=0; row<maxRows; row++){
         JBLOCKARRAY bufferCb = (*cinfo.mem->access_virt_barray) ((j_common_ptr)&cinfo, coef_arrays[1], row, 1, TRUE);
         JBLOCKROW blockrowCb = bufferCb[0];
         JBLOCKARRAY bufferCr = (*cinfo.mem->access_virt_barray) ((j_common_ptr)&cinfo, coef_arrays[2], row, 1, TRUE);
         JBLOCKROW blockrowCr = bufferCr[0];
    
         for(int block=0; block<blocksPerRow; block++){
           for(int k=0; k<64; k++){
             blockrowCb[block][k] = cbBuffer[row * width * 8 + block*64 + k];
             blockrowCr[block][k] = crBuffer[row * width * 8 + block*64 + k];
           }
         }
       }
    
      //////////////////////////////////////////////////////////
      // Step 6: Finish compression 
    
      jpeg_finish_compress(&cinfo);
    
      /* Compute the size of the compressed image.
      **
      ** This must be done after the image is done compressing, (i.e.
      ** after jpeg_finish_compress) but before jpeg_destroy_compress,
      ** because jpeg_destroy_compress frees the memory containing the
      ** information we want from the mem_dest_ptr within the cinfo
      ** structure.
      */
      {
        mem_dest_ptr dest = (mem_dest_ptr) cinfo.dest;
        input->setTranssize(dest->buffer_size - dest->pub.free_in_buffer);
      }
    
      //////////////////////////////////////////////////////////
      //  Step 7: release JPEG compression object 
    
      jpeg_destroy_compress(&cinfo);
    
      delete[] dctBuffer;
    
      // set the compression to JPEG, so the standard JPEG decoder is used, since this class cannot decode itself.
      // TODO: Everything should work with the correct compression set.
      input->setCompression(COMPRESSION_JPEG);
    
      return input;
    }
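
    A note on the fill loops in this listing: the index row * image_width * 8 + block*64 + k is only right if the GPU kernel stores the 64 coefficients of each 8x8 block contiguously, block row after block row. If the kernel instead leaves each coefficient at its pixel position in a raster plane, a different index is needed. In both cases libjpeg expects the 64 values of a JBLOCK in natural (row-major) order; the zigzag reordering is done by its entropy encoder. A minimal sketch of the two index computations, with hypothetical function names:

```cpp
#include <cassert>
#include <cstddef>

// Source index when every 8x8 block is stored as 64 consecutive values
// and the blocks follow each other row by row.
std::size_t block_contiguous_index(std::size_t block_row, std::size_t block_col,
                                   std::size_t k, std::size_t blocks_per_row) {
    assert(k < 64);
    return (block_row * blocks_per_row + block_col) * 64 + k;
}

// Source index when the coefficients stay at their pixel positions in a
// raster plane that is plane_width values wide.
std::size_t raster_index(std::size_t block_row, std::size_t block_col,
                         std::size_t k, std::size_t plane_width) {
    assert(k < 64);
    const std::size_t y = block_row * 8 + k / 8;  // line inside the plane
    const std::size_t x = block_col * 8 + k % 8;  // column inside the plane
    return y * plane_width + x;
}
```

    For a plane width that is a multiple of 8, block_contiguous_index matches the index used in the listing. Which of the two applies depends on how JPEG_cudaCompress writes its output, which the listing does not show.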
    

    I am extremely grateful for any suggestions.

    Regards

    Picknicker



  • Just compare the outputs in a hex editor.
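
    If a hex editor gets tedious, the comparison can be scripted. A minimal sketch that reports the offset of the first differing byte between two buffers, for instance a reference JPEG written by plain libjpeg and the GPU-assisted one, both already read into memory; first_difference is a hypothetical helper:

```cpp
#include <cstddef>
#include <vector>

// Returns the offset of the first byte at which a and b differ.
// If one buffer is a prefix of the other, returns the shorter length.
// Returns -1 if both buffers are identical.
long first_difference(const std::vector<unsigned char>& a,
                      const std::vector<unsigned char>& b) {
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) return static_cast<long>(i);
    }
    if (a.size() != b.size()) return static_cast<long>(common);
    return -1;
}
```

    With matching quality and sampling settings the two files should typically start with the same markers and tables, so an early difference would point at the parameter setup and a later one at the coefficient data.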

    Also, my tip: first copy the standard version of libjpeg into your code and then adapt it step by step to your needs.
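
    One step worth checking early when working outward from the standard code is quantization: the coefficients handed to jpeg_write_coefficients() have to be quantized with the same tables that jpeg_set_quality() puts into the header. A minimal sketch of how libjpeg scales a base table entry for a given quality (mirroring jpeg_quality_scaling() and jpeg_add_quant_table() with force_baseline set), plus a check whether the quantized DC range still fits into 8 bits, which matters because the posted code transports coefficients in a char buffer. The function names are hypothetical, and the DC range of -1024..1016 assumes 8-bit samples and the standard JPEG DCT normalization:

```cpp
#include <cassert>

// Quantizer step derived from a base table entry for a quality in 1..100.
int scaled_quant_step(int base_value, int quality) {
    assert(base_value > 0);
    if (quality <= 0) quality = 1;
    if (quality > 100) quality = 100;
    // Quality-to-percentage mapping: 50 -> 100%, 100 -> 0%, 1 -> 5000%.
    const int scale = (quality < 50) ? 5000 / quality : 200 - quality * 2;
    long step = (static_cast<long>(base_value) * scale + 50L) / 100L;
    if (step <= 0) step = 1;     // a step of 0 is not allowed
    if (step > 255) step = 255;  // baseline JPEG limit (force_baseline)
    return static_cast<int>(step);
}

// Division with rounding to nearest, symmetric around zero.
static int round_div(int value, int step) {
    return value >= 0 ? (value + step / 2) / step
                      : -((-value + step / 2) / step);
}

// True if every possible quantized DC value fits into a signed 8-bit integer.
bool dc_fits_in_int8(int step) {
    assert(step > 0);
    return round_div(-1024, step) >= -128 && round_div(1016, step) <= 127;
}
```

    With the standard luminance DC base value of 16, quality 75 gives a step of 8 and the DC values just fit into 8 bits; at quality 90 the step is 3 and they do not. The AC coefficients have their own steps and would need the same check.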

